FOREWORD



Working With Evaluators represents a significant advance in the development of scientifically tested
prevention programs that meet the needs of parents, schools, youth, and communities. This volume has been
written to assist prevention program staff to work cooperatively and effectively with evaluators and
researchers to apply their skills, knowledge, and sensitivities in the design and implementation of noteworthy
evaluations.

The prevention field has taken significant strides forward relevant to evaluation by breaking through the
resistance and fear of evaluative findings that have proven to be so typical of social programing. In contrast,
the field of prevention clearly recognizes and accepts the tenet that if the field is to continue to develop
and to emerge in the 1980's as a scientific discipline, this evolution will be based in part on the knowledge
gained from evaluative research and program evaluation.

The development of this volume and, more importantly, the National Prevention Evaluation Resource
Network (NPERN), cogently illustrate the many positive benefits to be derived [...]
projects. As a result of the consortium of States (Wisconsin, New Jersey, Pennsylvania) involved in that
effort, a system for evaluation had been created that is sensitive and responsive to the unique evaluation
needs of State and local prevention programs without imposing constraints or inapplicable standards. Just as
sound evaluation results from the partnership of a well trained evaluator and a skilled program staff, so too
will effective prevention programs result from the partnership of States, communities, families, parents, and
the Federal Government.



William J. Bukoski, Ph.D. 
Research Psychologist 
Prevention Research Branch 
Division of Clinical Research
National Institute on Drug Abuse 
PREFACE



The real pleasure in (disease) prevention
is in watching nothing happen.

Donald Millar, M.D.
Centers for Disease Control
(New York Times, Jan. 20, 1980)

The real pleasure in evaluation is in watching [...]
thus helping learn to make it happen.

Anonymous

And the real pleasure in creating this monograph was in working with and through a stimulating network
of people. In addition to the authors and editors, many people contributed significantly to help shape the
monograph.

Early outlines of the monograph were reviewed in depth by David Twain, Rutgers University Graduate
School of Criminal Justice, and Nancy Kaufman, Wisconsin Bureau of Alcohol and Other Drug Abuse. A
final outline was prepared in a two-day intensive work group attended by most of the contributing authors
and editorial staff.



Following submission of several chapter drafts by each contributor, a five-member national consumer
review group of prevention and evaluation practitioners was convened, selected with assistance from the
National Institute on Alcohol Abuse and Alcoholism, the National Institute on Drug Abuse, the National
Council on Alcoholism, the Center for Multicultural Awareness, and many individual prevention specialists.

The review group included Barbara Beit of the New Jersey Division of Narcotic and Drug Abuse Control;
Barbara Kline of the Rock Island (Illinois) County Council on Alcoholism; Patrick Ogawa of the
Japanese-American Cultural and Community Center, Los Angeles; Carol Stein of the National Federation of
Parents for Drug Free Youth; and Richard Stephens, Cleveland (Ohio) State University.

The consumer review members each independently read and critiqued the first full draft of the
monograph, then met with the editors as a group to consolidate suggested changes, and reviewed a second
draft incorporating their suggestions. Hugh Cline of the Educational Testing Service provided an
independent technical review.

John F. French    Court C. Fisher    Samuel J. Costa, Jr.

New Jersey Department of Health
Alcohol, Narcotic and Drug Abuse Unit
CHAPTER 1: INTRODUCTION

(What It's Mostly All About) 

Evaluation is about

    participation
    empowerment
    learning
    survival.

These are not words we normally associate with evaluation, but they are elements of the basic purpose and
message of this monograph: to foster participation, empowerment, learning, and survival in alcohol and drug
abuse prevention programs.

This is a monograph about evaluation of prevention programs.

It is a product of the National Prevention Evaluation Resource Network (NPERN). 

NPERN is a program of the Federal Department of Health and Human Services' National Institute on
Drug Abuse (NIDA). In 1978 the Prevention Branch of NIDA started NPERN to improve the number and
quality of evaluations conducted by and about drug abuse prevention programs. The National Institute on
Alcohol Abuse and Alcoholism (NIAAA) later added its support to encourage greater access by alcoholism
prevention programs to evaluation resources.

NPERN works primarily by bringing experienced evaluators [...]
programs to help the programs meet evaluation needs. This direct on-site technical assistance was provided
first in a 1978 pilot project in six States. A larger scale national technical assistance phase operated
through 1981.

As part of NPERN's program several publications were also written and published. A Handbook for
Prevention Evaluation is a summary of evaluation knowledge and technique applied to the prevention field
and is written primarily for evaluators. This monograph, Working with Evaluators, is a companion to the
Handbook and is designed primarily for prevention program managers.

Although it is written with the assumption that you, as a program manager, will have direct access to
evaluation consultants through the NPERN network, the monograph will also be useful to managers in
working with evaluators generally. Indeed, it can help you to understand, design, and conduct your own
program evaluations even if you have no outside assistance and expertise to help accomplish this.

As a user of the monograph, you are encouraged to read it through at least once from
beginning to end. Each prevention program manager will bring different sets of experience, interest, and
need to this monograph, and you will each find different chapters or sections to meet your interest. Some
redundancy is built in from chapter to chapter to maintain continuity, but the monograph as a whole is
shaped by the following structure:

Chapter 2, A Model for Program Change, introduces a conceptual framework for evaluation [...]
nine-step continual process of program planning [...]. Evaluation of program process,
outcome, and impact is introduced, along with ways to categorize information and target areas. Chapter 2
lays the groundwork for more detailed discussion of the process and content of evaluation in later chapters.
It should be reviewed by every monograph user and is must reading for program managers with little
evaluation background.
Chapter 3, Program Issues in Prevention Evaluation, shifts focus to highlight some characteristics of
alcohol and drug abuse prevention and its programmatic relation to the evaluation model of chapter 2. It
presents four major questions that prevention program managers must ask to participate effectively in
evaluation.

Chapter 4, Evaluation Issues in Prevention Programs, puts the program manager inside the evaluator's
head, to understand basic design and methodology questions that must be considered in conducting any
evaluation. By constructing and critiquing one case study of a poor evaluation, chapter 4 highlights
technical issues that managers and evaluators must examine together to assure useful evaluation. Chapter 4
also describes and comments extensively on basic quantitative and qualitative methods and provides an
introduction [...]. The focus is on the content more than the process of
evaluation, and the chapter may be useful as a continual reference for prevention program managers.

Chapter 5, [...] Evaluation, elaborates the 9-step model introduced in chapter 2. It takes
program managers through each step in detail, emphasizing their responsibility and participation with the
evaluator. Chapter 5 can be read and used as a checklist for good evaluation process.

Chapter 6, Case Studies in Prevention Evaluation, [...] didactic discussion of
evaluation content and process into three case studies. Emphasizing real-life process, the case studies focus
on communication between program decisionmakers and evaluators, and the relationships among the
interpersonal communications, program realities, and evaluation needs that encourage or hinder useful
evaluation.

Chapter 7, Politics and Science in Prevention [...], also uses case material but focuses on the
importance of the program's external political context for the success or failure of both the program and its
evaluation.

Overall, this monograph discusses evaluation as participation, empowerment, learning, and survival.
These themes flow from the experience and understanding of evaluation shared by the authors.

Participation is fundamental. Starting in chapter 2, which describes evaluation as part of a process of
continual program change, the need for program managers and evaluators to collaborate is emphasized. This
is not simply a matter of good personal relations but follows from the nature of evaluation itself.

Fundamentally, evaluation is a way to describe selectively and then to judge the value of something, in
this case your prevention program. The political and organizational history of evaluation reinforces an
ideology, and a reality, that this process of description and judgment is "scientific," carried out by experts
on less expert people and programs.



This monograph affirms that science and expertise are indeed involved in the evaluation of prevention
programs. But it affirms something more: that evaluation is not simply "objective" science composed of
facts outside your own interest and influence. Good (and bad) evaluation, like good (and bad) science, is
fundamentally a human activity of the people who do it.

That includes you as a prevention program decisionmaker. As manager, your primary responsibility is
[...] the ends and means, goals and methods, of your program. Evaluation
is an extension of this same responsibility at a second level. To the extent that you contribute to defining
the goals and methods of an evaluation, you will influence, if not control, its process and outcome.
Participate!



This monograph is also about empowerment: yours. One intention is to provide you as a prevention
program manager with enough of the "stuff" of evaluation, its values, language, and technique, that you can
participate intelligently and effectively with evaluators and other decisionmakers in the conduct of
evaluation. The monograph won't turn you into a full-time evaluator. It can help you become a better
contributor to and user of your own program evaluation, and thereby an even better manager.

Although technical aspects of evaluation are discussed throughout the monograph, chapter 4 contains
the most concentrated discussion. As you delve into this, remember another fundamental characteristic:
evaluation is about the certainty and uncertainty of what people know and can know about the world,
including prevention programs. Evaluation is about reducing the uncertainty of what we know. All the more
technical aspects of evaluation, including the most abstract, complex, and specialized scientific or
[...], are fundamentally about identifying different kinds of uncertainty and reducing it.
Evaluation is also about understanding that any approach to reducing uncertainty in the real world has
accompanying costs. Keep this principle in mind as you use the monograph to increase your own knowledge
and power.



Empowerment comes not only from learning the content of evaluation but from participating in the
evaluation process. In chapter 5, the authors take you step-by-step through the process of preparing for an
evaluation and working with an evaluator. Both this chapter and the chapter 6 case studies try to capture
the feel for assertive, intelligent give-and-take between program manager and evaluator that is the
hallmark of good evaluation process.

Participation and empowerment are twin aspects of your process as a manager in program evaluation.
Learning and survival are likewise twins, but they are the goals. Evaluation contains a natural tension
between acting in the world based on current belief and knowledge and remaining open to new experience
and knowledge that may change belief and action in the future. This is the tension between growth and
change, continuity and status quo. It is a tension and balance that affects each program, the field of alcohol
and drug abuse prevention in general, the larger society, and the political economy.

Chapter 3 explores some of the prevention program issues, including changes in the prevention field
itself, that contribute to the change/survival dynamic. Chapter 7 likewise focuses explicitly on the survival
value of program evaluation, emphasizing how recent political and economic changes have shifted the human
service emphasis from learning, change, and growth to a more survival orientation.

How does your prevention program fare in the midst of these changes? What criteria are your funding
decisionmakers using to divide a probably shrinking pie? Assuming you're still in the pie, what criteria and
information are you using for your own program and budget decisions? This, too, is the stuff of evaluation.
You may even find that asking these questions (challenging your own and your program's actions and beliefs)
can become as interesting as the actions and beliefs themselves. To incorporate the questioning
"evaluator" perspective may contribute to your becoming a more committed doer and manager.

Read the monograph through, pick and choose what interests you most, read and use again. We hope
you find the monograph as useful as we found it fun to create. Try it!
CHAPTER 2: A MODEL FOR PROGRAM CHANGE

(It Goes Round and Round and Never Stops) 



impact    process    outcome    feedback

target group    goals    objectives

cost effectiveness    utilization
a red herring    program development

indicators    research design

validity    reliability    random sampling

The above terms, among others, appear numerous times throughout this monograph. Such is the
language of the prevention evaluation field. If evaluation is used not simply as pruning shears for funding
agencies, but as a means of aiding program development and renewal, then most of the terminology can
become part of the everyday program vernacular. It would be best for program decisionmakers and
evaluators to speak the same language.

The purpose of this chapter is to describe an evaluation model based on constant feedback about various
aspects of a program to promote continual program development. But first, we must discuss the need for a
model and, following the model description, how to tie the model to various phases in a program's
development.



NEED FOR AN EVALUATION MODEL 



Funding for health and human services is always tight but has become more so in the recent past. Drug
and alcohol abuse prevention, as a new kid on the block of human services, especially needs to prove its
worth to various sources of pressure and funds. Taxpayers, Government agencies, foundations, and others all
seek more effective evaluation of programs in the human services field. "More effective" implies that past
evaluations have been lacking in effectiveness. This is a justifiable implication, but the need today is to
build from past problems rather than to tear down past evaluations. The potential for evaluation research is
enormous. The field itself has contributors coming from the many scientific disciplines involved in the
evaluation of human services: psychology, sociology, anthropology, political science, statistics, operations
research, systems analysis, economics, and computer science. The evaluation of any one human services
program (for example, a substance abuse prevention program) can draw on the growing literature from all
these fields.

One of the most often criticized aspects of evaluations is the underutilization of the results. One of
the reasons is that the questions of the decisionmakers who could best use the results are not always
considered during the early phases of the evaluation. If decisionmakers are not asked what information they
need, the evaluation may not even address the appropriate issues. The audience of decisionmakers we refer
to could range from funding sources or key program administrators to program staff or community activists.
For example, if a funding agency wants a strict cost-efficiency analysis, the program manager's interest in
which aspects of the program actually help the participants the most, regardless of cost, can go unnoticed
and unexamined. Conversely, if an evaluation is limited to an internal investigation of the success of
different approaches to prevention with no interest in economic realities, the project's administrators may
have difficulty in providing the type of information (for example, bottom-line costs) that some funders
demand.

Because the importance of using evaluation results cannot be overemphasized, it is repeatedly stressed
throughout this monograph. If the results of an evaluation are ignored, or never reach the critical
decisionmakers, the evaluation plan was not well thought out or implemented. In the case of the evaluation
model presented here, utilization of results will be seen as a basis for both program survival and
improvement.

A program manager is not expected to keep abreast of developments and techniques in the evaluation
field, of course. This is why there are evaluation consultants. [...] of the
appropriate techniques and applications for various methodologies should be taken for granted, but one of
the problems in the past has been methodological deficiencies. An evaluator has to be flexible, willing, and
able to divorce himself from his favorite method [...]. But how does a
manager know whether or not the evaluator is suggesting an appropriate method? By being a critical
consumer of evaluation services! A good manager will demand to be informed of the potential uses and
limitations of alternative designs for the program evaluation. A good manager needs to know the costs
(financial and informational) of one method compared to another. Even if the manager has no control over
the conduct of an evaluation, as in the case of a funding agency hiring an outside evaluator with carte
blanche to find out only what the agency wants to know, the manager has the right to know what is being
looked at and how it is being done. Ideally, a good evaluator-manager team will develop, pooling their
knowledge of the theoretical, the applied, the ideal, and the practical aspects of both prevention programs
and their evaluations. This monograph and the previously published Handbook for Prevention Evaluation
(French and Kaufman 1981) encourage team effort.

To build cooperation, a consistent frame of reference and language is needed for decisionmakers and
evaluators. The evaluation research model developed several years ago under the auspices of the National
Institute on Drug Abuse's (NIDA's) Prevention Branch (Bukoski 1979; French and Kaufman 1981), building on
work by Waller and Scanlon (1973) and others, provides the context for the evaluation issues, strategies, and
methodologies presented here.

No rigid, standard form of evaluating prevention programs is [...]
presented to encourage the incorporation of new developments in both prevention programing and evaluation
methods. This framework provides a rational approach to program evaluation and shows how evaluation
methods can be incorporated into a program in a manner most helpful to the prevention program itself.



THE EVALUATION MODEL 



This model can be used with any alcohol or drug abuse prevention approach. It features three levels of
evaluation:

    process, outcome, and impact

categorizes information into three types:

    descriptive, comparative, and explanatory

and can focus on one or more of four major target areas:

    individual, program, service system, and societal.

These three evaluation parameters (level, information type, and target area) are discussed below.
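
For managers who keep their evaluation plans in a spreadsheet or a small script, the three parameters can be written down as a simple data structure. The Python sketch below is illustrative only: the class names, field names, and the sample question are assumptions made here and are not part of the NPERN model itself.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    PROCESS = "process"
    OUTCOME = "outcome"
    IMPACT = "impact"


class InformationType(Enum):
    DESCRIPTIVE = "descriptive"
    COMPARATIVE = "comparative"
    EXPLANATORY = "explanatory"


class TargetArea(Enum):
    INDIVIDUAL = "individual"
    PROGRAM = "program"
    SERVICE_SYSTEM = "service system"
    SOCIETAL = "societal"


@dataclass(frozen=True)
class EvaluationQuestion:
    """One evaluation question, tagged with the model's three parameters."""
    text: str
    level: Level
    information_type: InformationType
    target_areas: frozenset  # one or more TargetArea members

    def __post_init__(self):
        # The model says an evaluation focuses on "one or more" target areas.
        if not self.target_areas:
            raise ValueError("an evaluation question needs at least one target area")


# Hypothetical example question, not taken from the monograph.
question = EvaluationQuestion(
    text="How many sessions were delivered as planned?",
    level=Level.PROCESS,
    information_type=InformationType.DESCRIPTIVE,
    target_areas=frozenset({TargetArea.PROGRAM}),
)
print(question.level.value, question.information_type.value)
```

Tagging each question this way makes it easy to see at a glance whether a planned evaluation covers only one level or target area when decisionmakers expected several.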



Level

Each level of evaluation (process, outcome, impact) has its own set of indicators and methodologies.
Ways to measure what is going on (methodologies) differ among the three levels, as do the things that are
measured (indicators). The three levels are discussed below, with a brief overview of all three followed by a
more thorough discussion of each.
Process evaluation is a thorough description of the various aspects of a prevention program. It
attempts to present a complete picture of the dynamics and characteristics [...]
prevention program. Process evaluation examines the target population [...]
program, the services delivered, and the utilization of resources for program components. These and other
aspects of the program all provide indicators at this level of evaluation.

Outcome evaluation is what most people think of when evaluation is mentioned. It is concerned with
measuring the effect of a program on the people participating in it. Outcome evaluation attempts to answer
the question: "Has the program had a significant [...] the desired
direction?" In essence, this level of evaluation is an attempt to determine if the program has met its
objectives in producing changes in perceptions, attitudes, behaviors, or other effectiveness indicators among
its targeted client group.

Impact evaluation examines the total effect of prevention programs on the community as a whole. The
key word here is community, which may be defined as a school, neighborhood, town, city, State, etc.
Community-wide indicators such as incidence and prevalence of substance abuse, related criminal activity,
and institutional/societal policy and change are measured through methods such as epidemiologic studies or
community surveys. The attempt is made to gauge the impact of a program operating over an extended
period of time or of several programs operating within a specified geographic area.

The three levels of evaluation are not mutually exclusive. Rather, they can be viewed as successive
phases in the development of information in a comprehensive evaluation effort.

Process evaluation.--The information gathered during this evaluative phase reflects all of the inputs
into a program, the patterns in which these inputs interact, and the various transactions and interactions
that take place within a program. Important process information includes the theory on which the program
operates, needs assessment, policy development, program design, and the characteristics of program clients,
staff, physical plant, decisionmaking structure, and finances. [...] of data can provide
continuous feedback to use for internal monitoring which can help guide and direct resource allocation,
organizational decisions, and ongoing program development.



Process information can also contribute to [...] replicability outside of, or external to,
the program. How can process information from different programs be compared? One cannot simply
compare programs without considering their operating contexts. These contexts are, themselves, part of the
process information. By categorizing this information into four general areas (human resources, physical
resource variables, contextual variables, and program specific variables) it becomes easier to identify
variations between or among programs.

Human resources include all client and staff variables affecting the program. The number and
description of clients served, staffing patterns, qualifications of staff, and attitudes and behaviors of both
clients and staff are all considered human resources of a program.

Physical resource variables include descriptions of the physical plant, equipment, and materials and the
program functions and activities which utilize these resources. Financial resources and expenditures are
important program inputs which also provide a basis for cost analysis.

Contextual variables describe the community and institutional environments in which a prevention
program operates. These directly affect the workings and effectiveness of the program. The demographic
and socioeconomic makeup of the community are important factors, as are community attitudes and rates of
various social problems (e.g., arrests and substance abuse related medical episodes).

Program-specific variables can be roughly divided into organizational structure, program service
delivery, and participant/staff/program interactions.

Organizational Structure.--An analysis of an organization can yield important information regarding
lines of authority, communication, and decisionmaking [...]. For instance,
there may be important differences between a freestanding prevention program and one that is part of a
larger organization. Over time, most facets of an organization can be expected to change, and a description
of the evolution of the current structure (and plans, if any, for future change) is very important.



Program Service Delivery.--Information regarding program service delivery includes the needs being
addressed, the assumptions/theories underlying the particular prevention strategy, and actual program
practices; the last involves the structure of delivery as well as content. Is it a sequence of presentations or
sessions or is it a one-time delivery? Are the sessions scheduled in advance or given on demand? Are the
timing and structure of delivery the same as that originally planned? The services actually [...]
to be looked at in relation to the program's theoretical base in two ways: first, does the program actually
carry out the [...] prevention strategy; second, do the services respond to the assessed needs? As
discussed later in this chapter, the actual program delivery may deviate from the intended delivery at
several phases in a program's development.

Participant/staff/program interactions.--Participant/program interactions include referral or selection
procedures, client expectations, and the time and quality of participation. Regardless of modality and
program, some identification or referral of clients is needed. It could be a formal referral network or
simply membership in a group identified "at risk" (for example, junior high school students in a particular
school district). Similarly, all participants have expectations regarding the program and its potential effects
on them. These expectations influence the degree of quality of participation in the program. Someone with
less motivation would not be expected to invest as much energy as someone who wants to gain as much as
possible from the program.

Participant/staff relationships involve both the frequency and duration of interactions as well as the
quality of contact between clients and staff members. Counts can be obtained and examined relatively
easily; qualitative assessments are more difficult. Client and staff perceptions of the "[...] where, how,
and why" of the interactions are important, as is the comparison between these perceptions.

Staff-staff and staff-program relationships can be examined to see how [...]
and share common goals. Absenteeism and turnover rates can highlight problems. Also of importance is the
congruence between intended and actual staff roles as well as the staff's expectations for both the overall
program and individual roles within it.

To summarize:

    process evaluation is a fancy way of
    answering the question
    "What's going on?"

    in a new program, process evaluation
    is the only way to know what's going on,

    and in any program, process evaluation
    tells you if what's going on is
    what you wanted to go on.

Outcome evaluation.--Information gathered during this phase usually addresses specific program
objectives concerned with changing participants' behavior, attitudes, values, or knowledge. The ultimate
goal of all prevention programs is the reduction of drug and/or alcohol abuse. However, depending on the
theory underlying the program, a more immediate objective may be something like "increase self-value" or
"improve social skills." These objectives are theorized to be associated with decreased substance abuse. In
other words, the program attempts to reduce the risk inherent in some state such as low self-esteem, poor
school performance, or maybe simply ignorance about drugs and alcohol, thereby decreasing future
substance abuse.

To assess whether program objectives have been met, they must first be identified. This is not always
as easy as it sounds. Using process evaluation, both intermediate and ultimate objectives can be identified
by examining the development of the program. Even if a full-scale process evaluation [...],
some process information must be collected to identify the program's objectives. What was the problem or
need leading to the program's initiation? How does the program purport to alleviate the problem and meet
the need? What effect does the program hope to have on its participants? Will it change attitudes or
change behavior in a more immediate way? Does it attempt to clarify values or increase knowledge of
risks? How long must clients participate in order to benefit from the program? How long are program
effects expected to be sustained?

Many program managers may find such questions simple and the answers clear. These managers will
also have a good understanding and clear statement of program objectives. However, some managers will
not know their programs' objectives immediately. And the objectives of some programs are not easily
specified. Thus one benefit of an evaluation may be the learning process undertaken to articulate the
objectives of the program.

Most programs have multiple objectives, all of which need to be identified. Different interested
parties, whether staff, participants, funding sources, or others, may emphasize certain objectives more than
others. All these factors need to be considered when objectives are listed. If some important objective is
omitted, an outcome evaluation may fail to detect a significant contribution of the program.

Depending on the program, the intermediate objectives may be to produce changes in one or more of
the following areas:

     Attitudes                          Intended future use
     Personal skills                    Interactions with family and peers
     Knowledge about drugs/alcohol      School performance
     Criminal activity                  Social-recreational activities

This list is not exhaustive, and some managers may immediately identify other areas where their program
seeks change. The program manager needs to make sure that all relevant objectives are identified before an
outcome evaluation is actually conducted.



To summarize:

     outcome evaluation tells you whether
     what's going on
     changes the participants.

Impact evaluation. Information produced at this level of evaluation is broader in scope than process or
outcome information. There are, however, parallels between outcome and impact evaluation. An outcome
evaluation measures changes in program participants, whereas an impact evaluation measures changes in the
entire population for whom generalized effects are expected. The identification and estimation of impact
are particularly important in evaluating prevention activities. For example, the results of an impact
evaluation can inform decisions about program expansion. The results of an impact study on an entire
high school population where only some students participated in a prevention program could aid in expanding
the program to reach even more students, perhaps in other schools.

Generalized effects of a program occur throughout the community, however defined, and across
prevention programs within a community. Thus these effects are often measured in aggregate or cumulative
form such as incidence/prevalence levels, rates of drug or alcohol arrests, and hospitalizations. A decrease
in substance abuse may also benefit the community in other ways. For instance, an improved school
environment and lower maintenance costs may result from reduced substance abuse. Of course, one task of
the impact evaluation is to determine how much of the overall improvement is attributable to the
prevention activities operating within the community.
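
The difference between the outcome level and the impact level can be illustrated with a small calculation.
The sketch below is only an illustration: the counts are invented, the function names are hypothetical,
and prevalence is taken simply as the share of a group reporting substance use. It compares the change
among program participants with the change in the whole school population.

```python
def prevalence(users, population):
    """Return the share of a population reporting substance use."""
    if population <= 0:
        raise ValueError("population must be positive")
    return users / population


def change_in_prevalence(before, after):
    """Return after-minus-before prevalence for (users, population) pairs."""
    return prevalence(*after) - prevalence(*before)


# Hypothetical counts, invented for illustration only.
participants_before = (60, 200)   # 60 of 200 participants report use
participants_after = (40, 200)
school_before = (250, 1000)       # whole school, participants included
school_after = (225, 1000)

# Outcome level: change among the participants themselves.
outcome_change = change_in_prevalence(participants_before, participants_after)
# Impact level: change in the entire population the program hopes to affect.
impact_change = change_in_prevalence(school_before, school_after)

print(f"Outcome level (participants): {outcome_change:+.3f}")
print(f"Impact level (whole school):  {impact_change:+.3f}")
```

A change in the aggregate figure does not by itself show how much of the improvement is attributable to
the program; that question still requires an appropriate evaluation design.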

Before program impact can actually be assessed, some important barriers that limit the extension of
program outcome must be carefully considered. For example, if a program is aimed at a very limited
subgroup (by age, race, ethnicity, geography, etc.) of a high risk population, then the magnitude of any
measured impact on the entire population might be quite small. Other factors to be considered for an
impact evaluation include a definition of community related to a program's size and impact, intended and
unintended effects, and delay and durability of effect.

Definition of community. The probability of a prevention program reaching members of a target group
is obviously related to the size of both the program and the group. The definition of community should
relate to the scope and objectives of a program and be limited to an area in which detectable impacts may
be expected. If a program is limited to one class within one school, the impact of the program will
probably be limited to families of the students involved, some of their peers, and perhaps their neighbors.
The definition of community should be so limited. Compare that to the case of a television show where the
potential impact, and thus the community, are limited only by the scope of the broadcast (local, regional, or
national broadcast).

Intended and unintended effects. By definition, intended effects of a program are always positive.
They are, after all, based on program objectives. Unintended effects may be either positive or negative.
For instance, a program aimed at decreasing one type of substance abuse (alcohol) may increase a different
type (cigarette smoking). Though these effects are not expected, knowledge of them may help in modifying
the program, for example, by adding a lung cancer film to the film on alcohol-related brain damage.

Delay and durability of effect. If an impact evaluation is implemented too soon after a program is
initiated, no impact may be found. Obviously there may be a delay before any generalized effects are
measurable. To assess the durability of the impact of a program, timing is again important. If possible, a
followup study would indicate the length of time that the overall impact of a program can be sustained.

The issues of intended and unintended effects, as well as of delay and durability of effect, are as
important for outcome evaluation as they are for impact evaluation. In other words, evaluations must consider

     what happened, expected or not
     how long it took to happen
     how long it did (or would) last.



All of these factors need to be taken into account by the program manager and evaluator. Developing a
rational plan at the impact level may be more involved and more costly, but the knowledge gained can be
significant.

To summarize:

     impact evaluation shows whether
     what's going on changes the larger community.

Finally, looking at evaluation as a whole:

     each evaluation level can lead you
     through feedback loops
     to program improvement, or
     to put it graphically,

     [Diagram: Process, Outcome, and Impact, connected by arrows representing feedback loops]



Figure 2-1 illustrates a list developed by NIDA of major indicators and approaches for the three levels
of evaluation. Note that process and outcome evaluation focus on effects within the program, whereas
impact evaluation focuses on effects at the community level. Relevant to this model, various methodologies
are discussed in chapter 4 of this volume and in the Handbook.


Information Type

A second parameter of evaluation is the type of information that can be generated. Three types can be
identified: descriptive, comparative, and explanatory. Descriptive information is the easiest and least
expensive to obtain. As the name implies, this type of information describes the program, the clients, the
staff, the environment, and so forth. Much of the process-level information obtained in describing a
program is necessarily descriptive. Hence, it is important that the program records from which the
information is drawn are adequate. A straightforward management information system for collecting
descriptive information can be started early in a program's development or can be the first step in an
evaluation process.

Comparative information involves variables thought to significantly affect program functioning, but
does not imply causality. For example, staff attitudes concerning prevention can be compared to the
program participants' attitudes toward prevention. Both sets of attitudes may affect program functioning,
but determining which set caused the other is the old chicken and egg problem: which did come first? The
cost of comparative information will be higher than that of descriptive information in terms of time, effort,
money, and design, but more complex issues can be examined.

Explanatory information is used to try to answer even more complex questions such as, why does the
program work? If two groups of 12th grade students show different levels of substance abuse, can the
difference be attributed to the prevention activities of one group? More importantly, what program
components are responsible for the effects? Obviously, gathering and analyzing this type of information
requires even more sophistication in terms of design and theory testing as well as more financial and other
resources. But if the purpose and goals of the evaluation require it, the effort expended is worthwhile.
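
As a first step toward the explanatory question about two groups of students, an evaluator might check
whether the observed difference is larger than chance alone would readily produce. The following is a
minimal sketch, assuming a pooled two-proportion z-test is appropriate; the function name and the counts
are hypothetical and are not drawn from any study discussed in this volume.

```python
import math


def two_proportion_z(users_a, n_a, users_b, n_b):
    """Pooled two-proportion z-test.

    Returns (difference in proportions, z statistic, two-sided p-value).
    """
    p_a = users_a / n_a
    p_b = users_b / n_b
    pooled = (users_a + users_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Hypothetical counts: 30 of 200 program students and 50 of 200
# comparison students report substance use.
diff, z, p_value = two_proportion_z(30, 200, 50, 200)
print(f"difference = {diff:+.2f}, z = {z:.2f}, p = {p_value:.4f}")
```

Even a difference that is unlikely to be due to chance does not show that the prevention activities
caused it. Attribution depends on the evaluation design, such as how the two groups were formed.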

In general, the type of information sought is a function of data availability (what data are already
gathered and what can be obtained), evaluation design (within the constraints of availability, what does the
manager want to know), and analytic technique (in what form does the evaluator want the data). A fuller
explication of the process of choosing information type(s) is found in chapter 4.

Figure 2-1. Drug abuse prevention evaluative research model (Bukoski 1979)

LEVEL OF EVALUATION:  PROCESS, OUTCOME, IMPACT

Focus of evaluation

     Process and outcome:  Prevention program effects
     Impact:               Aggregate or cumulative effects at the community level

Potential indicators of effectiveness

     Process:
          Description of target audience/recipients of service
          Prevention services delivered
          Staff activities planned/performed
          Financing resources utilized

     Outcome:
          Changes in drug-related:
               Perceptions
               Attitudes
               Knowledge
               Actions:
                    Drug use
                    Truancy
                    School achievement
                    Involvement in community activities

     Impact:
          Changes in:
               Prevalence and incidence of drug use
               Drug-related mortality/morbidity
               Institutional policy/programs
               Youth/parent involvement in community
               Accident rates

Potential prevention evaluative approaches

     Process (examples):
          The Cooper Model for Process Evaluation
          NIDA-CONSAD Model
          NIDA-Cost Accountability Model
          Quality assurance assessment

     Outcome (examples):
          Experimental paradigms
          Quasi-experimental designs
          Ipsative designs, e.g., Goal Attainment Scaling

     Impact (examples):
          Epidemiologic studies
          Incidence and prevalence studies
          Drug-related school surveys
          Cost-benefit analysis

Target Area

A third facet of the evaluation process is the target or focus of the program and hence the focus of the
evaluation. For example, are changes in individuals over time being sought? Are community (however
defined) or societal changes in attitudes/behaviors of interest? Depending on where the center of interest
lies, different questions can be asked of different people. The evaluative focus is usually one of the
following targets: individual, program, service system (comprising several programs), or societal. The
choice will depend on the needs and resources of the decisionmakers involved in the evaluation process. For
example, a school board in an urban area may want to evaluate various prevention projects throughout the
school district as a whole, or one principal may want to find out if a specific group activity is succeeding in
its prevention activities. These two situations will result in different types of evaluation activity, with
more emphasis placed on community-wide impact evaluation in the first case than in the second. However,
an evaluation focused on one target area can still have an effect on others. For instance, an evaluation
concerning a group of students in one prevention project could contribute to a better understanding of the
overall service system of which that program is a part.

The three parameters (level, information type, and target area) and their relationships are graphically
displayed in figure 2-2.

Figure 2-2. Evaluation Considerations (French and Kaufman 1981)
[Figure not reproduced; one axis is labeled "Level of Evaluation."]


DEVELOPMENT OF AN EVALUABLE PROGRAM

Every program is evaluable: some information is always available to indicate what's going on. A major
objective of program evaluation is to use this information base for decisionmaking. Continual program
improvement is contingent on feedback to the manager and other staff regarding program development and
implementation. With this in mind, evaluation should become an integral part of ongoing program
development, supplying appropriate feedback to decisionmakers. Certain issues of program relevance,
program quality, etc., can be examined at different phases of program development and operation; the
information obtained can provide a foundation from which criteria for further development and management
decisions can be established.

The greatest power of evaluation will be realized if evaluation has a role from the first stages of
program development. For example, a process evaluation documenting the earliest phases of program
development can provide information that would otherwise be unavailable. However, regardless of when the
evaluation takes place, feedback can enhance the chances of further growth and improved program effects.

Five major phases of program development were delineated in the Handbook for Prevention Evaluation.
The same distinctions are presented here, emphasizing the information needs of the manager and questions
appropriate for each phase. The phases are:

     o needs assessment

     o policy development

     o program design

     o program initiation

     o program operation.

The discussion below looks at the first three stages as planning phases and the last two as
implementation phases.



Planning Phases

Needs assessment. The initial phase of program development is establishing whether and to what
extent a certain problem exists within a given subgroup in the community. For example, is there a growing
substance abuse problem among a high school's student body? Once this information is obtained, a specific
cause of the problem is postulated, leading to the definition of a need for a specific process to overcome the
problem. For example, if the problem is caused by a lack of organized activities involving high school
students, then an alternatives program for high school youth would be proposed as a means of ameliorating
the situation. If the problem is inaccurately measured, or the causal assumption is wrong, then the program
may eventually be found ineffective. The manager needs to have accurate information to confirm that the
program is based upon the correct assumptions concerning the problem while the prevention program is still
in the planning stage rather than when the program is in full operation.

     The ideal: problem assessment leads to the definition of need.
     The frequent reality: the problem assessment is used to justify what
     somebody already believes.

Policy development. During the second phase, the goals and specific objectives of the program are
defined, based on the theory postulated in the previous phase. Many different factors, not all of which are
internal to the program, need to be taken into account at this point. Financial resources, values, attitudes,
and concerns of various individuals (policymakers at the levels of program, local government, State and
Federal government, program staff, and potential program participants) need to be identified and their
impact on program policy assessed. Depending on the specific problem, goals and objectives may have to be
limited in a realistic sense to fit the sociopolitical environment. Given the context of these variables, the
manager will want an accurate translation of the theory into policy. A clear understanding of the factors
involved, whether they would support or impede the program's development, is needed to ensure rational
policy development.

     The ideal: goals and objectives flow from previously formulated theory.
     The frequent reality: programs can operate for years without
     formulating anything but the most obvious goals.

Program design. The final planning stage transforms the program policy into operational
characteristics. Specific program components and activities must be developed in relation to overall policy.
This is the operationalization of the policy, where the program decisionmaker needs to know what has been
done previously to meet similar objectives. How can the same thing be accomplished now, given existing
resources, program capacity, staff size, facility limitations, staff background and qualifications, and
community characteristics? All of these factors need to be taken into account in order to produce a fully
detailed program design.

     The ideal: program components and activities are rationally justified by goals and objectives.
     The frequent reality: trial and error.

The introduction of an evaluation at any of these planning stages can increase the amount and quality of
feedback. To bring the reality closer to the ideal, the evaluation should do more than just assess the
attainment of specific objectives. If stated objectives are not reached, information concerning stages of
development before program operation becomes critical. At earlier stages an evaluator can ask questions
that would also be of interest to the program manager. For instance, at the needs assessment stage, the
assessment of the problem can be examined. If the objectives of the program are met, but the problem does
not really exist, should the program be labeled a success? Or maybe the assumptions regarding the cause of
the problem or the definition of the need are erroneous. In that case, the objectives may not be met in even
a smoothly operating program because the policy developed and implemented may have no bearing on the
problem.

The foundations of process-level information are found in all three of these planning phases. Evaluation
at this time can provide information on the flow from

     problem => need => theory => policy => goals => objectives => design

Information needed for process evaluation may be available later while the program is in operation, but it
would probably be of more immediate help to the manager if available during these planning stages.
Information would also tend to be available more efficiently, with less cost in terms of time, effort, and
money, before program implementation.



Implementation Phases

Program initiation. At this stage, the program is established and implemented; translation of theory
into action takes place. The manager can now see if the implementation matches the program design. That
is, information on participants, resources, and constraints can be compared with the program
design. This stage can also be viewed as a debugging phase where problems in implementation are corrected
and the program is set up for smooth operations. Is the program operating as designed? Are staff
assignments recognized, accepted, and carried out? Are the participants receiving the types of services
planned?

     The ideal: bugs are recognized and corrected.
     The frequent reality: the bugs survive.

Program operations. Once the program is fully operational, it does not simply run by itself. Good
management and direction are needed to keep the program functioning and improving. In addition, a
program does not operate in a vacuum. Continual upgrading and development of the program must include
mechanisms for adapting to changing needs and problems in the client population and community. Some
changes may be the result of the prevention program, as measured by outcome and impact evaluation.
Others may be due to some external forces, such as local, State, or Federal political decisions, changing
levels of community involvement, or changing supports and constraints of funding sources.

     The ideal: operating programs continually increase their ability to meet objectives.
     The frequent reality: maintenance of the status quo or irrational change.



None of these phases necessarily represents discrete, mutually exclusive periods of time. Program
development is a dynamic process, with constant feedback and improvement. Different aspects of a
program can be in different stages of development at the same time. As needs of the community change, so
too must the program evolve. Evaluation is one tool that can be used to aid in that development. The model
presented in this chapter is one method of ensuring a rational approach to both the evaluation and
development of the program.


CHAPTER 3: PROGRAM ISSUES IN PREVENTION EVALUATION

(What Managers Need to Know or Remind Themselves About)

The scene is the director's office of a local prevention program. Scattered across the desk are all the
signs of a late evening's work. The book titles tell most of the story. An evaluation is being considered,
and dusty volumes of college textbooks on statistics and research methods are being frantically reviewed
for long-forgotten definitions: chi squares, t-tests, and type II errors. The director wonders what she can
give to the evaluation when she doesn't even remember what a quasi-experimental design looks like.

The director's predicament is not uncommon. Most conscientious program decisionmakers are aware
that they have a role to play in the evaluation process. Some have watched evaluation studies take place
within their own programs or have begun to explore the literature on evaluation. However,
too little has been written on the specific role of the program manager.

Some program professionals, like the director above, try to become conversant enough with research
terminology to at least participate in planning at some level. Others, who have little or no background in
evaluation research, may fail to see the importance of their involvement and turn the entire task over to an
evaluation consultant.

Undoubtedly, the manager needs to know enough about evaluation to ask critical questions concerning
the methods being used. Other sections of this monograph address concerns about evaluation models and
measurement. The focus of this chapter, however, is on program knowledge rather than evaluation
knowledge. Amid the work and anxiety of an evaluation project, the program decisionmaker frequently loses
sight of the fact that:

     The most significant contribution program managers make to development of the evaluation lies in what
     they know about the program rather than what they know about the evaluation process.

To appreciate the significance of this statement, it is important to understand what makes an
evaluation work. Weiss (1972, p. 6) makes an important distinction between research and "evaluation"
research by noting that, in evaluation, the questions to be considered are those of the program rather
than those of the researcher. Sooner or later the decisionmaker must consider these issues:

     o What do I need to know about the program?

     o What decisions am I prepared to make?

     o How should the evaluation results be presented to help make those decisions?



Many elaborate evaluations have failed to yield valid or useful results because the evaluator made
inaccurate assumptions about the program itself or because the users of the evaluation findings had not been
clearly identified.

Program information from the perspective of the decisionmaker is crucial to the evaluation process. It
represents a view of the program the evaluator does not have and provides a context for evaluation
activities. Program considerations affect every aspect of the evaluation process, from the selection of
questions to the choice of instruments to the use of results. They influence what kind of evaluator should be
consulted and what kind of staff adjustments will be necessary to accommodate the evaluation.

In some ways it might seem presumptuous to devote a chapter to explaining prevention programs to
program decisionmakers. After all, aren't most managers already familiar with the resources and services
of their organization? Yes and no. They usually have information about the program, but need to
understand it from an evaluative point of view.

Decisionmakers usually keep at their fingertips such basic information as a description
of services and an organizational chart. However, at times the manager needs other kinds of information.
For example, in long-range planning, questions must be asked about the program's mission, the consumers of
its services, and its potential for change. In the same way, certain aspects of the program need to be
considered in preparing for evaluation. However, many program decisionmakers have not been shown
these connections. Too often, evaluations are not geared to the needs and circumstances of the program,
and the program staff's questions are never incorporated into the design. There is always the risk that the
program will serve the evaluation rather than be served by it.

Asking important program questions at the beginning of the evaluation process helps to ensure that the
results will be genuinely useful. False starts due to misunderstandings or confusion are eliminated, and a
true partnership can develop between the evaluator and program personnel.

In this chapter, program issues relating to evaluation will be grouped into four areas and discussed from
a manager's perspective:

     o What is the program and what is it meant to do?

     o What are the evaluation questions to be asked by the program?

     o What kind of evaluation will fit a particular program?

     o Will the evaluation be worthwhile for the program?

Reflecting on a prevention program from this perspective is not only helpful for the program
decisionmaker but, as Patton (1978) points out, equally valuable for funding sources, line staff, and
consumers. Perceptions about program goals and services are not always shared among people at
different levels. An evaluator may receive very different impressions of the same program when it is
described by an administrator, a staff member, or a client. As many program perspectives as possible should
be integrated for the evaluation to be successful.

The program manager should be involved throughout the evaluation process. Program issues
concerned with interpretation and utilization of findings are as significant as those that arise
in early phases of a study. Most importantly,

     the decisionmaker's knowledge
     of the needs, purposes, and goals of the program
     is essential to evaluation.



WHAT IS THE PROGRAM AND WHAT IS IT MEANT TO DO?

This is the simplest of questions, and one to which every program manager has a ready response. All
programs have goals and objectives, even if they are implicit and unwritten. Yet, there may not be an
identifiable program to evaluate or even agreement about the program's purpose. Evaluators often work
with this ambiguity. Many note that consultations with prevention programs frequently begin by backing up
and reexamining program goals.

Evaluators encounter two common problems with program objectives. The first has to do with the
relationship between objectives and the program process and outcomes. A program may have a
beautifully written action plan that no longer describes the services currently provided. Perhaps funding
was cut. Perhaps there was staff turnover, or a particular project was changed slightly. Maybe the program
never did reflect the stated objectives, which might have been written originally to satisfy some other
audience. Without objectives that accurately describe the program's current intended outcomes, the
evaluation may proceed on a meaningless course.

The second problem is more complex but no less common. Many programs' stated objectives describe
only program effort or process. For example, a prevention program directed toward schoolchildren might
include the following objective: deliver eight teacher-training sessions during the school year. This
objective is clear and measurable but describes only the process, not the outcome of that activity. Such
statements of program process are necessary for the evaluator to understand the program's services, but
alone do not link the activity to its outcome. The evaluator may be unsure what outcomes to examine.
Worse, the program may include an impressive array of prevention services without any clear
indication of the specific results expected. Both outcome and impact evaluation rely heavily on well-defined
statements of what condition(s) should exist as a result of the program. In addition to a description
of program process, objectives setting forth the intended program outcomes are essential.

Leaving aside the evaluator's use of outcome objectives, their importance as a guide for the program
decisionmaker is unquestioned. Stated another way, "If you don't know where you're going, you may end up
someplace else." A program may show all kinds of results, but it is difficult to judge success or failure
without some objectives against which to measure those results.

Goal setting is the first major task in preparing for an evaluation, and one of the manager's
responsibilities. Do you have clear and concise goals and objectives relating to program effort as well as to
outcome? Can your services be clearly identified and defined? Is there agreement about the program's
intended results? Do you have a clear sense of what represents success or failure? How much change is
satisfactory?

Programs with articulated, measurable outcome objectives make both daily management and evaluation
design much easier. Valuable time and resources that would otherwise be spent on goal setting and program
planning can instead be used to discuss specific evaluation methods.

Other aspects of the program may also help identify its structure and purpose to both the manager
and the evaluator. In the prevention field, for example, programs can be categorized in a number of general
ways that help to describe their goals as well as their strategies of service delivery. Although these
program dimensions may not be specifically written down, they are no less important to decisionmakers in
describing the program.

Prevention/Health Promotion

Prevention programs not only employ widely different strategies, but also try to effect different goals. The
most notable distinction, perhaps, is between programs intended specifically to prevent alcohol and drug
problems and those with more general goals, such as health courses with substance abuse modules. Within an
evaluation, recognizing these distinctions is important; they help evaluators appreciate the kind of program
results acceptable or of importance to decisionmakers.

Indirect Service/Direct Service

Many prevention programs deal with intermediary groups to promote change in a target group. In such
cases, program goals may be stated in terms of the eventual change desired in the target group. For
example, a school-based program may have as its goal the development of social competencies among
elementary school children. However, the program activities may be directed toward the training of
teachers and school administrators. In this case (as in similar activities like information distribution,
training, and consultation), the program manager must distinguish ultimate consumers from those directly
affected by program activity.

Etiology of Abuse/Model of Prevention ■ i'p.k'Xt . ^ X ■ 

Programs differ in their perspective on the causes and prevention of alcohol and drug abuse. Some base
their services on models of individual attitude and behavior change. Others approach the problem from a
perspective of social standards or cultural norms. Implicit in every prevention program is a set of beliefs
about what causes people to develop problems and what preventive strategies are likely to be effective.
Identifying these beliefs is extremely important in defining the kinds of results sought. For example, one
community adopted a prevention program designed to change norms regarding public intoxication. Although
the community organizers used familiar strategies of awareness and community education, evaluators would
have missed some of the program's substance had they looked only for measures of individual change. A
clearly articulated program philosophy is essential in creating an evaluation design, deciding what to
measure, and choosing measurement tools.
The program purpose, written or unwritten, is the cornerstone of [...]. The evaluator's role is to
determine actual effects of program services. However, the role of the program decisionmakers begins with a
clear statement of what they intend to accomplish.



WHAT ARE THE EVALUATION QUESTIONS TO BE ASKED BY THE PROGRAM?



Once the program decisionmaker has defined goals and objectives of the program, it is time to ask
similar questions about the evaluation. Evaluations must also have goals and objectives. Evaluators and
program administrators alike are often dismayed by how few evaluation studies yield results useful for
program direction. To be sure, part of the problem lies in conditions outside the evaluator's control.
Nonetheless, more often than not the program decisionmaker finds that the study has failed to address
essential questions.

Except for the fundamental questions regarding the program's intended outcomes, no aspect of the
evaluation is more important than developing the questions that need to be answered. As with program
objectives, evaluation questions should be stated as specifically as possible. For example:



o By the end of the project year, can an increase be shown in the number of schools using the entire
curriculum developed by the program?

o Can a decrease in the number of arrests for driving while intoxicated be shown in Baker County
over the first 6 months of the project?

o Can test scores of program participants show an increase in knowledge regarding the risks of drug
use during pregnancy?

Obviously, the type of change the evaluation questions examine depends on program outcome objectives
set forth by the organization. These first two phases of preparing for the evaluation are interdependent.

Because funding sources and program managers sometimes want different things from an evaluation,
the manager may want to set some priorities. Certain questions may be more important to the organization
than others or may be more answerable given the time and resources of the study. For example, a program
director may be interested in comparing two different prevention strategies. However, this kind of
comparative study may be less pressing for the organization than having other information available to the
county for the next funding cycle.



As in the goal-setting process, a number of considerations are helpful in developing evaluation
questions. The manager must ask why and for whom the evaluation is needed. Program evaluations are
conducted for many different reasons and audiences, for example:

o To provide feedback for internal management to guide development of the organization.

o To assure accountability to some external source. With decreasing availability of financial
resources, programs are called upon to use evaluation results to justify new or continued funding.
In some cases, the manager may know exactly what criteria the funding source will use to judge a
program. At other times, though, the program is forced to make assumptions about what kind of
evaluation results will be convincing to authorities.

o To market new and innovative program methods. Other services provided by an organization may
be well accepted in the community, and a manager may want to use the evaluation to add
credibility to more recently developed services. In particular, evaluation findings may be used to
support decisions about replicating pilot programs.

o To meet requirements of a grant or contract. The manager should, of course, look beyond the
program's mandate for evaluation to consider ways in which the research findings can be useful for
both the program and the mandating agency.

o To satisfy the curiosity of someone in the organization (particularly in programs where innovative
strategies are being used). Although such questions may have little relationship to the stated
program objectives, some of the most dramatic program effects are discovered through the
personal conviction and questioning approach of someone deeply involved in the delivery of
services.

o To respond to the needs of users. Evaluation is best formulated with participation by users
regarding the questions to be asked and the way findings will be used. Don't forget anybody:
legislators, school board members or county commissions, funding source representatives, boards
of directors, program administrators, line staff, and consumers. The concerns and viewpoints of as
many user groups as possible should be incorporated into the evaluation questions.
Program decisionmakers must also recognize when an evaluation might be inappropriate. For example:

o When you don't know what the program is: there may be no agreement on program goals and
objectives.

o When you don't have the resources to answer the questions you need to answer.

o When the answers won't make any difference: when the potential users of the evaluation results
are unable or unwilling to take action based on those results.

These factors should be considered seriously by the manager before undertaking an evaluation.
Although pressure is increasing for prevention [...] efforts,
decisionmakers should recognize when evaluation is incapable of yielding useful results.

WHAT KIND OF EVALUATION WILL FIT THE PROGRAM?



Even program decisionmakers who appreciate their role in developing program objectives and research
questions may believe their involvement ends when evaluation methods are discussed. Managers with little
or no background in research techniques may be inclined to [...] available. In fact, the evaluation design
and the selection of appropriate instruments should begin with yet another set of programmatic questions
best answered by the manager. In too many cases decisions regarding evaluation design and methods are
left entirely to the evaluator. This can lead to problems, including the possibility that the resulting
data cannot be used. Selecting appropriate evaluation methods begins at the program level with the
question

        How can the information be collected and presented
        in a way that will be convincing and useful?

Program managers can ensure the usefulness of evaluation findings by playing an active role in
determining methods. Evaluators are human too. They represent a number of disciplines, giving them a
variety of perspectives and experiences. The manager should choose an appropriate evaluator to help
answer the program's questions. The major consideration is [...] partnership with the program. However,
other factors influence an evaluator's ability to respond to program needs.

o An evaluation may address issues ranging from changes in individuals to effects on entire
communities. Inevitably, evaluators have varying levels of experience with different areas of
social research. One consultant may be excellent for measuring change in individual
attitudes but have little background in evaluating a community organization project. The skills
necessary to measure individual change or social change are not mutually exclusive, but the
manager should look for an evaluator experienced with the kinds of questions being studied.

o The evaluator must be sensitive to [...]. Ethnographic studies, for example, demand that the
evaluator become intimately familiar with the culture being studied. Even with more traditional
techniques, the importance of cultural sensitivity on the part of the evaluator cannot be
overemphasized. In multicultural or ethnic communities, it cannot be assumed that standardized
instruments will yield valid results. Not only do issues such as language and methods of data
collection [...] but also the community's norms for such things as drug use, social interaction,
and healthy lifestyles.

o Evaluation methods [...] types, qualitative and quantitative [...].
Traditionally, only quantitative methods were acceptable in sound evaluation practice. More
recently, a number of noted evaluators, Campbell (1975) and Cronbach et al. (1980), for example,
have moved away from [...] and opened up the possibility of
qualitative approaches. These include participant observation, program journals, and unstructured
interviews. Depending on the prevention program, quantitative or qualitative methods, or both,
may be called for. Evaluators, however, may be more comfortable or skilled in one area, and the
manager must strive to match the evaluator's style with the needs of the program.

These approaches are not mutually exclusive. Many [...]
methods and attempt to measure change at both individual and group levels. Based on training and
experience, evaluators may approach the project with a set of biases. Perhaps they have a favorite
instrument used successfully with other programs, or a conviction about good evaluations that does not allow
for [...] influence the design, and it is critical that they be
able to respond sensitively to the kinds of evaluation questions being addressed.


Other program issues determine the kind of evaluation to be conducted, including the needs and
capabilities of the organization.

o Money! Good evaluation need not be expensive, but certain direct cost decisions must be
considered. Will clients be paid for their participation in the evaluation? Will other professionals
need to be hired?

o Program recordkeeping. Does the quality of existing records meet the information needs of the
planned evaluation?

o Data analysis resources. Do resources, including computer access, exist at the level necessary to
analyze the data collected?

o Time constraints. Will the results be available when they are needed?

o Program staff availability and expertise. How much are program staff expected to contribute to
each phase of the evaluation? Will they be able and willing?

o Money!

The kind of evaluation that fits any single prevention program depends, in part, on
finding an evaluator with appropriate experience, matching an evaluator to the cultural dimensions of the
program, deciding on the appropriateness of qualitative and quantitative measures, and looking carefully at
the resources of the organization. There are also other factors outside the organization's influence, such as
the mandates of funding sources. In each case, the program [...]
designing the evaluation. The study itself involves far more than simply choosing instruments and
interpreting printouts. It is a process of deciding how to ask appropriate questions and how to represent the
findings in a useful and convincing way.



WILL THE EVALUATION BE WORTHWHILE FOR THE PROGRAM?


In even the best-planned evaluations, where program objectives have been articulated, [...]
stated, and a study design developed, there [...] on the part of the program
decisionmaker. Will the evaluation process end up costing the program more than it offers? For whatever
reasons the evaluation is conducted, will the findings warrant the amount of time and attention it involves?

These are important questions for the manager to consider. In every case, the process can be better
managed if some of the potential costs and benefits of evaluation are first analyzed.

An evaluation project can cause disruption within an organization in countless ways. Evaluation studies
often bring with them additional forms to fill out, new assignments for staff, demands for clerical
assistance, and increased attention to program details. An evaluator [...]
and unfamiliar faces will be injected into the program's daily operation. Staff may feel the pressure of
having their professional activities scrutinized, and awareness of outside accountability usually creates some
degree of anxiety.

Left unattended, these dynamics can result in serious resistance to the evaluation process. All other
preparatory steps are useless if the staff does not [...] complete the
study. It is essential [...] all possible ways in which the
evaluation might negatively affect day-to-day operation of the program.

To the extent [...] the program should be drawn into the evaluation planning
from its inception. Evaluators should become familiar to staff, and the reasons for each component of an
evaluation design should be thoroughly explained at each stage of the process.

In some cases, disruption cannot be avoided. The program might need to be modified to accommodate
an evaluation design. For example, if the evaluation requires data on a program's parent-education
component, more emphasis may need to be placed in this area for a period of time to develop a large enough
sample for study.

Other clashes may occur between the program philosophy and aspects of the evaluation design. (Many
of these issues can be avoided through the kind of design planning discussed earlier.) For example, some
program professionals believe it is unethical to randomly serve some clients but not others, a feature of
some evaluation designs. Issues may also emerge regarding the use of confidential information or the
presence of an evaluator as an observer in group activities.

These kinds of situations cannot help but be disruptive to a program. However, to the extent that such
changes are well planned and thoroughly explained they need not have negative effects. Potential disruption
to both the program and the evaluation process can be minimized if the manager anticipates and plans for
such possibilities.

Other aspects of the evaluation process may have unpleasant repercussions if they have not been
considered. For example, sometimes evaluation [...] negative results.
Particularly where an evaluation may be used to justify the program's funding or continued existence, the
manager must carefully consider the potential effect of less-than-positive findings. In the same way that
staff resistance or other internal effects of an evaluation process must be examined, the manager must also
look at the ability of the program to accommodate indicated or recommended changes. The evaluation
process can be particularly costly to a program that is prepared to receive only enthusiastic validation.
Even negative evaluation findings can be used constructively if the program is resilient enough to accept
criticism and consider change.

Program disruptions caused by evaluation can be offset by potential benefits. In addition to providing
external accountability and support for prevention programs, evaluation [...] internal
decisionmaking and provide continuous feedback to staff, helping to modify or improve program practices.
For consumers of prevention services, who either participate in the program or are concerned about its
effectiveness, [...] quality control. Finally, whether the results are
anticipated or the findings are of any significance, evaluation can prevent what Weiss (1972, pp. 116-128)
refers to as "barnacle-encrusted" programs. In other words, just by incorporating the process and rigors of
evaluation, prevention [...] their program with creativity and continue to grow and
change in ways that improve their services to people.

Successful evaluations are a marriage of program knowledge, good management, and research skills.
For the manager, the importance of moving through the planning stages described here cannot be
overstated, each stage building on the other. Without a clear sense of what the program intends to
accomplish, it is impossible to ask meaningful evaluation questions. Without specific questions, appropriate
methods cannot be chosen to conduct the study. Without measures that are sensitive to the needs of the
program, the evaluation threatens to harm more than help. Without adequate resources to analyze and
interpret data, the best measures may come to naught. Without clear and relevant presentation of findings
to evaluation users, the whole effort may be fruitless.

These are program issues. The success of any evaluation is intricately tied to the manager's active
participation in reflection and planning. This chapter began [...]
direction she could give to an evaluation when she couldn't even remember what a quasi-experimental design
looked like. The answer: a considerable amount. Old college textbooks on statistics and research methods
are useful, but the manager's primary contribution to the evaluation process is understanding the program
and what it needs to know.
CHAPTER 4: EVALUATION ISSUES IN PREVENTION PROGRAMS

(The Heavy Stuff— What Else?) 

This chapter looks at evaluation from the evaluator's perspective and is designed to provide program
decisionmakers with technical information so that they can:

o appreciate the difference between good and bad evaluation design,

o better understand what evaluators do,

o become more active participants in the evaluation process, and

o become wiser consumers of evaluation.

Technical aspects of evaluation are presented throughout the chapter. Evaluation terminology is
emphasized so program decisionmakers can better understand and communicate with evaluators.

Evaluators, in designing an evaluation of a program's effectiveness, have [...] responsibility to
set up the evaluation so the question of whether the program produces desired effects can be answered as
accurately as possible. Accurate answers demand [...]
must understand these issues for two reasons. First, in using evaluation results to make decisions, program
managers need to be wise consumers, able to judge the quality of evaluations, rather than forced to take
results at face value with no understanding of how they were generated. Second, managers in the process of
having evaluations designed for their programs will be better able to understand the evaluator's activities.
Evaluators often do, or ask program staff to do, certain tasks that may seem a waste of time at best, or
costly and disruptive of program [...] managers who understand
what is at stake with various aspects of the evaluation can contribute to the quality of their program
evaluation. A director may well ask, "What will it take to convince others that my program is valuable?"
An adequately designed evaluation that documents the nature of the program and then shows its
effectiveness is at the root of answers to that question.

ISSUES IN EVALUATION DESIGN



An easy way to consider design issues is to scrutinize an evaluation. First, we'll describe an evaluation
design, then we'll backtrack and examine it to show how, through faulty design, an evaluation can lead to
incorrect conclusions about the program. We will then consider issues of theoretical and technical
importance in the evaluation process.



The following hypothetical example was created to illustrate poor evaluation and issues of evaluation
design.

Prevention program.--The program was intended to improve self-concept among junior high school
adolescents in seventh and eighth grades. The program's theory was that improved self-concept would cause
a decreased desire to use drugs as an escape from the difficulties of adolescence, as well as an increased
resistance to peer pressure to experiment with drugs. The program was designed specifically for children
[...] relating to family and peers. The program consisted of groups of students meeting with program staff
once a week after school over the course of a semester.

Program staff.--Two staff members each led one student group. The first was the school guidance
counselor, who had substantial previous experience working with troubled adolescents and wanted to show
the school system the worth of such programs. The second was a foreign language teacher who was thinking
of going back to school and changing careers in the direction of working with adolescents in a counseling
setting. She wanted to try leading a student group to see if she would enjoy intensive contact with
adolescents. The guidance counselor had been trained in the self-concept curriculum at a special workshop
and had run the program once at a local community center. She introduced the program to the school and
trained the foreign language teacher just before the semester began.



Participants.--Program participation was voluntary. The program was advertised in the school through
a poster campaign. Each group leader also solicited students to ensure adequate participation. Finally, all
teachers in the school were asked to encourage their homeroom students to participate, especially those
who seemed to have problems.

Evaluation.--The guidance counselor wanted data showing that the program was effective in improving
self-concept. [...] the school psychologist, who suggested that she use a self-concept scale that he
was developing and had already tested on some high school freshmen and sophomores. Because he was
interested in data from junior high students, he agreed to analyze the data in exchange for having the use of
the results for further [...]. He suggested that the guidance counselor administer the
test at the beginning of the semester as a pretest and at the end of the semester as a posttest. Since the
program was ultimately supposed to prevent or delay the use of drugs, the school psychologist also
recommended, and the guidance counselor adopted, a well-known scale of self-reported drug use.

At the beginning of the semester, the 2 groups contained 35 participants, 16 with the guidance
counselor and 19 with the language teacher. Participation waned so that by the end of the semester only 18
participants remained, 13 with the guidance counselor and 5 with the language teacher.
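
The size of the dropout is easier to judge as a rate than as raw counts. The following is a minimal
Python sketch, not part of the original evaluation; the function name attrition_rate is an illustrative
assumption, and the only data used are the participant counts given above.

```python
def attrition_rate(start: int, end: int) -> float:
    """Return the fraction of starting participants who left before the end."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (start - end) / start


# Counts reported in the example: (participants at start, participants at end).
groups = {
    "guidance counselor": (16, 13),
    "language teacher": (19, 5),
}

rates = {name: attrition_rate(s, e) for name, (s, e) in groups.items()}
total_start = sum(s for s, _ in groups.values())
total_end = sum(e for _, e in groups.values())
overall = attrition_rate(total_start, total_end)

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} of participants left")
print(f"overall: {overall:.0%} of participants left")
```

On these counts roughly one in five of the counselor's students left, against nearly three in four of the
language teacher's, so the students who took the posttest were not the same mix of students who took the
pretest.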

Because the guidance counselor was concerned about data confidentiality, she instructed the students
not to put any identifying information on their pretests or posttests. The only information she kept was
which were pretests and which posttests.

The school psychologist strongly recommended gathering self-concept and drug use information on
students not participating in the program, taking these measurements at the same time as the pretests and
posttests. The language teacher asked nonparticipating students in her classes to voluntarily take the test
at the beginning and again at the end of the semester. She got the highest response rate from her advanced
language class and ended up with 20 pretests and 17 posttests. She also did not require identifying
information on the tests, but merely kept pretests and posttests separate.


The school psychologist analyzed the data using the t-test to assess whether average self-concept score
was higher on the posttest than on the pretest. He applied the t-test separately to group participants and
nonparticipants and found no statistically detectable self-concept change in either case. A similar analysis
showed no change in the average scores on the drug use test. The guidance counselor, clearly disappointed
in the results, concluded that her program had no beneficial effect on participants.
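
One consequence of collecting tests without identifying information can be sketched in code. The text does
not say which form of the t-test was used, but because no pretest could be matched to its posttest, only a
comparison of two unlinked sets of scores would have been possible. The Python sketch below contrasts that
with a comparison of each student's own change. The six pairs of scores are invented for illustration and
are not data from the example, and the function names are assumptions.

```python
import math
import statistics


def independent_t(pre, post):
    """Pooled-variance t statistic treating pre and post as two unlinked sets of scores."""
    n1, n2 = len(pre), len(post)
    pooled_var = ((n1 - 1) * statistics.variance(pre)
                  + (n2 - 1) * statistics.variance(post)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(post) - statistics.mean(pre)) / std_err


def paired_t(pre, post):
    """t statistic on each student's own change; requires matched pre/post scores."""
    if len(pre) != len(post):
        raise ValueError("paired test needs one pretest and one posttest per student")
    diffs = [after - before for before, after in zip(pre, post)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


# Hypothetical self-concept scores for six students (invented for illustration).
pretest = [20, 35, 28, 42, 31, 25]
posttest = [23, 37, 32, 44, 35, 27]

t_unlinked = independent_t(pretest, posttest)
t_linked = paired_t(pretest, posttest)
print(f"unlinked t = {t_unlinked:.2f}, linked t = {t_linked:.2f}")
```

With these invented scores every student gains a few points, yet the unlinked statistic is small because the
spread between students swamps the gain, while the linked statistic is large. A small but consistent change
can therefore go undetected when pretests and posttests cannot be matched.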

Developing the evaluation.--As stated above, the guidance counselor wanted to show the school system
the worth of drug prevention programs by collecting data to confirm that this program was effective in
improving self-concept. The school psychologist pointed out that it would also be helpful to obtain a
measure of change in self-reported drug use.

It must be recognized that the results of evaluations are used as a [...]
persuasion. Unfortunately, not only did the counselor and the psychologist neglect to consider whether they
were asking the right questions, they also failed to identify the prime users of the evaluation findings and
the ways the data could be used to [...] the program's effects. Beyond these problems, the study did not
adequately assess the theoretical bases of the program.


What is to be evaluated is at once a political and a theoretical [...] mounted as
drug prevention programs are not directed to drug use itself but rather to improving life skills, with the
expectation that a number of self-destructive and antisocial behaviors will be affected. Thus a self-concept
[...] offered as a drug prevention program, might also [...] by the juvenile justice system.
The underlying assumption would be that similar connections exist between poor self-concept and criminal
behaviors such as vandalism or delinquency. In both instances self-concept [...].
But emphasis on the thorough measurement of drug use rather than criminal behavior would, in part, be
occasioned by the agency funding the program, the concerns of the audience for whom the evaluation is
intended, and the theory on which the program is based. Each of these factors needs to be considered
carefully to sharpen the focus of the evaluation.

Suppose the guidance counselor and her friend had sought a meeting with the school principal before
conducting the evaluation. They might have found that the principal:

o didn't believe in the worth of self-reports of such behaviors as drug use,

o didn't believe that improving self-concept had anything to do with reducing drug use, or

o didn't have the final authority to decide whether the program should continue.

Such a meeting could have [...]
the evaluation. First, the value of self-reports is a measurement issue. The worth of a measure, that is, its
reliability and validity, is an empirical question, one that can be answered by collecting and analyzing data
or by reference to past research.

Second, the link between changed self-concept and reduced drug use is an issue of the validation of a
theory, which also can be empirically tested. The questions to be asked in an evaluation are derived from
the goals of the program and the theory behind them. Third, the question of who has the power to use the
information leads back to the motives for the evaluation.



Three questions which can lead a manager to a usable evaluation are worth repeating:

o What do you want to know?

o Why do you want to know it?

o How will you use the information you get?

The first, the one most often asked, depends on the [...]
of the individuals who seek the evaluation. And the third depends on the quality of the information as
perceived by those who will use it in some decisionmaking process.



Obviously, these three questions overlap. The motives for the evaluation will dictate in major part
what research questions will be asked and how the answers will be used. Suchman (1967, p. 143) named
several ways in which evaluations can be abused. Some of these are:

o Eyewash: evaluating only those program aspects which are expected to look good.

o Whitewash: covering up program failure by deliberately choosing nonobjective or biased
information, such as testimonials.

o Submarine: seeking information on program weaknesses in order to destroy rather than improve the
program.

o Posture: seeking an evaluation only as a gesture to display scientific objectivity.

o Postponement: using evaluation as an excuse to delay decisionmaking.

Such abuses are sometimes based on the desire to support unfounded beliefs about the program or on the
desire to acquire or maintain power or status. These motives are not reserved for the conscious abuse of
evaluation research. [...] Directors without faith in their
programs are rare. The school guidance counselor wanted to show others that the program worked, and her
belief in the theory was the cornerstone of her motives both for starting the program and evaluating it.

At a different conceptual level, evaluations can be motivated by the desire to improve a developing
program or by the desire to demonstrate that a fully developed program is effective. [...]
prevents the evaluation from serving both purposes. In our example, the guidance counselor [...]
satisfied that the program was operating according to plan. For instance, she gave no indication that she
was interested in improving the program by identifying group leader characteristics that might guide the
selection or training of future leaders. Relevant to the second motive, [...]
by a variety of reasons: to meet funding requirements, to enhance acceptance of the program, to test its
theory, to support expansion, or simply to satisfy a natural curiosity.
Clearly, there is a [...] and the purposes best
served by an evaluation. Even replications of well-established programs are appropriate for evaluation, if
only to [...] that they accurately reflect
the intended program model. In such cases, the program administrator is typically the decisionmaker who
will use evaluation findings. Evaluations of more mature programs are more likely to be used by several
decisionmakers. [...] to understand the motives of all key actors and information
users to develop a pertinent evaluation design.



Clarifying [...] determines what should be
measured. A program begins with a set of goals. These goals get translated into program activities which,
it is assumed, will affect the behaviors encompassed by the goals. Until the goals of a program have been
clearly defined and the links from goals to activities to outcomes have been made, we have no guidelines
for what to measure. In our sample evaluation, the guidance counselor gave insufficient consideration to the
potential effects of the program. Changing self-concept is an intermediate outcome [...].
The goal of the program apparently [...]
outcome: preventing or decreasing drug use. But improved self-concept might manifest itself in other
areas, such as school performance or improved relationships with family and peers. Such potential outcomes
have to be specified and incorporated into clear operational goals [...]
choice of variables to be measured in the evaluation. Good evaluation is preceded by a careful articulation
of the goals of a program. In our sample evaluation no such activities apparently preceded the choice of
measures, hence the paucity of dimensions of outcome considered. An evaluator can be very useful to
program staff in helping them define and articulate goals and turn these into testable evaluation questions.

The importance of clarifying every step in program development can be illustrated by returning to the
theory behind the sample program, which can be stated as a set of three ordered propositions, each building
on the previous one:


o There is an association between self-concept and drug abuse. Those who view themselves
positively tend to abuse drugs less.

o A change in self-concept will cause a change in drug abuse. As self-concept improves, drug abuse
(or its potential) will decrease.

o The program, as designed and implemented, will improve self-concept.

This theory implies a consequence: participants in the program will have a reduced likelihood of drug
use. A theory is affirmed by testing its consequences. If the program has no effect on the drug abuse
patterns of participants, then at least part of the theory is false. The association between self-concept
and drug abuse has been documented, but whether changes in self-concept cause a reduction in drug abuse
potential is not clear. The falsity could lie here, in the second proposition above, or it could be found
in the design and implementation of the program. Improving self-concept [illegible]. In any event, when the
implied consequence is false, then at least part of the theory behind it must be false.



However, when drug use is reduced, we cannot conclude that the theory is true unless no other possible
explanation exists for the change. In an infinite universe this is a practical impossibility. Logically,
the truth of any theory cannot be proven; it can only be inferred with degrees of certainty. At some point,
however, the weight of the evidence becomes strong enough that it is reasonable to act as if truth has
indeed been proven. The majority of people in the world are probably not aware of Newton's Law of
Gravity. Fewer are aware that this Law does not explain the phenomenon as well as Einstein's much
stronger, more inclusive theory. Even fewer would be willing to test the truth of either theory by jumping
out of a tenth-floor window.

The strength of a theory can be increased in two ways. First, if one tests the consequence several
times and finds it true each time, the plausibility of the theory is increased. But this requires enough
information on program activities to repeat them accurately. The literature in the field of substance abuse
is filled with evaluations that describe programs so inadequately that their activities cannot be repeated.
Although these evaluations can draw conclusions about program outcomes, they allow no opportunity to
repeat the study. It is claimed (Patton 1978) that one team of evaluators paid so little heed to program
activities that they actually evaluated a social program that did not exist. [illegible] science, the team
found the nonexistent program to be ineffective. Outcome studies are incomplete unless they clearly link
program activities both to program goals and their underlying rationales.

A second way to increase the plausibility of a theory is to test it against a reasonable, explicitly
formulated alternative theory and its implied empirical consequence. The more competing theories
discounted, the more plausible the theory being tested.

To give an idea of the complexity of testing a theory, here are some of the competing explanations that
might have been considered in developing the sample evaluation design:

o Students might simply outgrow the tendency to abuse drugs.

o The charismatic influence of the group facilitator causes the change.

o Personal attention being [illegible]

o Students who choose to enter the program bring to it an intent for change that could have occurred
   without the program.

o The availability of drugs might have been reduced during the time the program was in operation.

To the extent that each proposition in any theory has already been demonstrated, the focus of
concern is changed. If the evidence that changes in self-concept cause changes in drug use is sufficiently
strong, then emphasis should be placed on the program's translation of theory to goals, strategies, and
specific program activities.

The more competing theories we discount,

the better able we are to claim that our chosen theory is plausible.

The most frequent complaint of evaluators, shortly after initial contact with a program, is that program
objectives are not clear, specific, and measurable and sometimes are not even articulated. Often the goal
statements written in funding proposals reflect the politics of obtaining funds more than actual
expectations [illegible] statements of measurable actions or behaviors regarding the intended
accomplishments of the program. Such statements are often referred to as operational statements. Because
of conflicts that sometimes exist between various interest groups [illegible], identifying program goals
and translating them into operational statements can be difficult and painful.

Scrutinizing the evaluation design.-- Looking back at the example's negative results, we must ask
whether the program really had no effect or whether the evaluation design might have allowed a real effect
to go undetected. The opposite is also true: an evaluation that yields positive results may show effects
that do not exist or are attributed to the program when they are really caused by something else.



In the case of the example evaluation, there are substantial reasons to expect negative results, even if
the program were effective. These reasons span issues of both process and outcome evaluation. Keep in
mind that evaluation is about the identification of differences and [illegible] implied. The evaluator's
job is to locate the sources of differences, or variation. Any part of the variation that cannot be
explained is called uncontrolled variability, and any source of uncontrolled variability in the design
weakens it because it reduces the amount of variation that can be explained.

Issues of process evaluation.-- Process evaluation of the sample program was nonexistent. Many process
evaluation questions could have been asked that would have reduced uncontrolled variability. First, what
about the service delivery aspect of the program? What did the guidance counselor and the language
teacher actually do in running their groups? Perhaps the guidance counselor went beyond the curriculum,
whereas the language teacher, who had no prior experience, had to struggle to present the material.
Technical competence is not the only possible source of difference between the group leaders. The guidance
counselor believed strongly in the program, having introduced it in the school, but the language teacher
sought the [illegible] because of personal commitment to the program.

Differences between the two group leaders were a first source of uncontrolled variability in the design. 

What about the participants? We have no information about them. Note that there were a number of routes
into the program. A student could volunteer without any contact from the school staff or could be drafted
into the program. Possibly the students drafted by the guidance counselor were a select group with
[illegible], while those drafted by the language teacher were especially bright students because they were
taking foreign languages early in their academic careers. Finally, all teachers were asked to refer
students. Thus another source of uncontrolled variation was the nature of the participants, including the
mechanisms by which they entered the program.

What about the extent of participation of the students? We don't know whether each participant
actually experienced the program to the same extent. Maybe some students attended all sessions while
others attended almost none. This expands our second source of uncontrolled variation to encompass not
only whom the program is reaching, but to what extent as well.

What about the relationships between staff and participants? We have some information: the guidance
counselor retained [illegible] of the initial participants, whereas the language teacher retained only
26 percent. For a given level of technical competence, some staff will have better relationships with
participants than others, producing yet a third source of uncontrolled variation.


Note that while this differential attrition, or dropout of participants, might result from the quality of
relationship, it might also result from the different technical competence of the two leaders. Or there
might be some simpler explanation. For example, the language teacher had a number of advanced students
in her group and the local high school started a special program for them that conflicted with the schedule
of the self-concept program.

The evaluation design has obviously failed to give any information about the program as it was
delivered, the nature of participants and their level of participation, or the quality of the relationship
between program staff and participants. The bottom line is, we don't know whether the program as designed
was ever delivered to the participants for whom it was intended. Without this information, questions of
whether the program worked seem either presumptuous or preposterous.

Issues of outcome evaluation.-- Let us assume that a program of known characteristics was
delivered and that participants did receive the program as planned. In that case, issues of outcome
evaluation are at the heart of the judgment as to whether the program had the desired effect on
participants. These issues encompass four phases of an outcome evaluation:

o At the design phase, how participants and nonparticipants were selected.

o At the measurement phase, how the variables were chosen and then how they were measured.

o At the analysis phase, whether the appropriate statistical tests were employed and whether the
   evaluation design was sensitive enough to detect program effects if they existed.

o At the interpretation phase, to what extent one may give meaning to the data and generalize the
   findings.

Design phase.-- When we ask whether a program is effective, we are really asking whether participation
in the program has changed participants relative to what would have happened had the program never existed.
It is not enough to simply measure changes within program participants. No matter how much change takes
place, we have no foundation to argue that the change is due to the program. That argument can only be
made by comparison. The ideal comparison would be created by turning back time, with the one difference of
interjecting the program during two otherwise identical passages through time. In the example, we would
then compare the individuals with themselves at the conclusion of the two time periods. Any differences
could then irrefutably be ascribed to the presence of the program; we could then prove causality.

Since time cannot be turned back, other, less than ideal comparisons must be found by playing a
scientific version of the game,

What would have happened if ... ?

We can approximate what would have happened if the program had not existed by comparing two groups as
identical as possible except that one group does not participate. The experimental or treatment group
participates in the program; the comparison group does not. If the groups are comparable at the outset
and differ only on the variables of interest after program intervention, program participation probably
produced the difference.

The comparability of the groups is critical. The sample evaluation included no systematic construction
of comparable groups; only extraordinary luck might have produced participant and nonparticipant groups
that were initially comparable. So,

the best evaluation requires

comparable treatment and comparison groups.
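
The comparison described above can be sketched in a few lines of arithmetic. This is only an illustration: the scores are invented, the function name mean_gain is hypothetical, and nothing here is taken from the sample evaluation. It assumes each student's pretest and posttest can be paired and that the two groups were comparable at the outset.

```python
from statistics import mean

def mean_gain(pre, post):
    """Average posttest-minus-pretest change for one group (scores paired by position)."""
    if len(pre) != len(post):
        raise ValueError("pretest and posttest lists must be the same length")
    return mean(b - a for a, b in zip(pre, post))

# Hypothetical self-concept scores; higher means a better self-concept.
treatment_pre = [40, 42, 38, 45]
treatment_post = [46, 47, 44, 50]
comparison_pre = [41, 43, 39, 44]
comparison_post = [43, 44, 41, 46]

treatment_gain = mean_gain(treatment_pre, treatment_post)     # change among participants
comparison_gain = mean_gain(comparison_pre, comparison_post)  # change without the program

# The estimated program effect is the part of the participants' change
# that the comparison group's change does not account for.
estimated_effect = treatment_gain - comparison_gain
print(treatment_gain, comparison_gain, estimated_effect)
```

Measuring change within the treatment group alone would credit the program with the full gain; the comparison group supplies the estimate of what would have happened anyway.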

Another, less elegant way to approximate "what would have happened if" would be to conduct an
extended series of measures over time on the participants, both before and after the program. Then, if a
sharp discontinuity emerges in this time series once the program is introduced, the difference between
expectations based on past measures and actual later findings is probably due to the program. A major
problem with this approach is that we still cannot rule out the effects of history, of events or conditions
that may affect the measures. It is far easier to rule out such confounding effects, if they exist, using
a comparison group.
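
The time-series reasoning can be sketched as follows, assuming equally spaced measures and a simple straight-line trend before the program. The counts are invented and the helper fit_line is hypothetical; a real analysis would also need to consider history, as noted above.

```python
def fit_line(ys):
    """Least-squares slope and intercept for observations taken at times 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

# Hypothetical monthly counts of drug-related incidents.
before = [20, 21, 22, 23, 24, 25]  # six measures taken before the program
after = [18, 17, 18]               # three measures taken once it was running

slope, intercept = fit_line(before)

# Expectations based on past measures: extend the pre-program trend forward.
expected = [intercept + slope * (len(before) + i) for i in range(len(after))]

# The discontinuity is the gap between actual later findings and those expectations.
discontinuity = [obs - exp for obs, exp in zip(after, expected)]
print(expected, discontinuity)
```
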
The major problem in making comparisons is in the selection of subjects. Problems relating to subject
selection continue throughout the evaluation. Even when comparable groups are formed at the outset,
attrition of treatment or comparison subjects during the evaluation weakens comparison. At the design
phase, attempts should be made to estimate the amount of attrition and to devise ways to minimize it. In
the sample evaluation no consideration was given to this problem.

Matters of design go beyond subject selection. All matters of process evaluation are ideally settled at
the design phase, before the evaluation begins, and not as an afterthought once the evaluation is started.
It is possible to have designs that are unable to detect program effects. Evaluators have the
responsibility of designing evaluations that can detect effects of programs, if they exist. The number of
participants is a critical part of this issue. In the sample evaluation, the number of participants was
abysmally small, particularly at the posttest.

Confidentiality and informed consent are also design issues. The guidance counselor in the sample
evaluation weakened the already insensitive design by not providing the information necessary to match
pretest and posttest of individuals. Confidentiality does not require the absence of identifying
information. One can ensure confidentiality and still be able to match pretests and posttests. Finally,
ethical issues of withholding potentially beneficial treatment from participants assigned to comparison
groups must be thrashed out at the design phase.
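
One way the matching could be done today, offered only as a sketch, is to replace each student's identifier with a one-way code before the tests reach the evaluator. The technique (a keyed hash from Python's standard hmac module), the function names, the student IDs, and the secret are all assumptions made for illustration; the original text does not prescribe a method.

```python
import hashlib
import hmac

def match_code(student_id, secret):
    """Return a one-way code: the same student always gets the same code,
    but the code cannot be turned back into an ID without the secret."""
    digest = hmac.new(secret.encode("utf-8"), student_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]

def link_scores(pretests, posttests):
    """Pair pretest and posttest scores that share a code; unmatched codes are dropped."""
    return {code: (pretests[code], posttests[code]) for code in pretests if code in posttests}

# Hypothetical secret, held by someone outside the evaluation team.
SECRET = "held-by-school-office"

pretests = {match_code("S001", SECRET): 40, match_code("S002", SECRET): 38}
posttests = {match_code("S001", SECRET): 46}  # student S002 left before the posttest

linked = link_scores(pretests, posttests)
print(linked)
```

The evaluator sees only codes and scores, yet each pretest can still be linked to its posttest, and attrition shows up as codes with no match.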

Measurement phase.-- Issues at the measurement phase can be classified in two categories: what should
be measured (already discussed) and how program outcomes should be measured.

Measurement of outcomes is usually equated with the administration of paper-and-pencil tests, but
measurement goes beyond this. Behavioral observations at the one extreme and [illegible] at the other
extreme can be used to measure the same variables. Regardless of the approach to measurement, a number
of standards must be applied. Are the measures suited to the population being measured? The guidance
counselor in the example evaluation did not consider whether the students could understand the
self-concept test, a test that had been tried only with high school students. The content of the measure
is critical: for example, items about whether individuals feel confident of being accepted by a good
college are better suited to high school students than to younger students.

The reliability of a measure, its stability over repeated measurements, is also a critical matter. If the
same test measures something twice, and the scores of individuals change unpredictably, then the measuring
instrument is unreliable. We would, for example, throw out a bathroom scale that showed our weight to
vary by 10 to 20 pounds each time we got on the scale. Such measures with a lot of "wobble" introduce
another source of uncontrolled variability in the design. In the sample evaluation no attempt was made to
establish the reliability of the measures, that is, to find out whether the measures were stable.
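
A common way to put a number on this kind of stability is the test-retest correlation: give the same test twice to the same people and correlate the two sets of scores. The sketch below assumes that approach and uses invented scores; the text itself does not name a specific statistic.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scores for five students tested twice, a short time apart.
first_testing = [10, 20, 30, 40, 50]
stable_retest = [12, 19, 31, 41, 49]  # students keep nearly the same scores and order
wobbly_retest = [30, 10, 50, 20, 40]  # scores change unpredictably

r_stable = pearson_r(first_testing, stable_retest)
r_wobbly = pearson_r(first_testing, wobbly_retest)
print(round(r_stable, 3), round(r_wobbly, 3))
```

A correlation near 1 suggests a stable instrument; a low one is the numerical counterpart of the wobbling bathroom scale.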

An equal problem is whether the measures are valid. Just because the school psychologist thought he
had created a test of self-concept doesn't mean, in fact, that the test measures self-concept. The
validity of a test, that it measures what it purports to measure, needs to be established. Just because a
measure is reliable does not guarantee that it is valid. However, reliability is a necessary condition for
validity. It is pointless to ask what we are measuring if we are unable to measure it in a stable way.

Thus evaluations may fail to show program effects due to measurement failures in reliability and
validity. The school psychologist's self-concept test was of unknown reliability and validity. It is
possible that the participants' self-concept did change, but that the self-concept test, being unreliable,
invalid, or not suited to participants, failed to detect the change. In the same manner, the sample
evaluation's drug use measure may have been inappropriate for this particular group, for example, by
emphasizing drugs that students were not trying, while failing to consider other drugs that were popular.

In a good evaluation, great effort is expended to develop sound measures. For example, the evaluator
could try out instruments on individuals similar to the participants, and perhaps test them more than
once. The evaluator might ask staff members to participate in the process to study the test administration
procedures. In validating a self-concept instrument, the evaluator might ask staff members to identify some
students with good self-concepts and some students with poor self-concepts and then see whether the test
scores concur with these judgments. Where school records are used, the evaluator may want to check on
their accuracy before using them in an evaluation. The sample evaluation failed to deal with the issues of
measurement that are at the heart of good evaluation.

Analysis phase.-- Some evaluation designs are unable to detect real effects of the program. When we
say "detect real effects" we mean that a statistical test confirms a true change in some measure.

The ability of a statistical test to detect real effects is called its power. We call the change from
pretest to posttest averages the systematic effect. But in addition to systematic effects, there are other,
uncontrolled sources of variability. Statistical tests work by comparing the extent of systematic effects
noted in the data with the amount of uncontrolled variability in the data.

The sample evaluation reveals numerous sources of uncontrolled variability: two vastly different group
leaders potentially selecting vastly different types of student into the program, differing levels of
participation, and measures of unknown reliability and validity. To use nontechnical terms, all this noise
or slop in the design obscures whatever systematic change might have existed. As designed, the evaluation
was almost doomed to show either no change or uninterpretable change before the data were ever collected.

Much could have been done to increase the power of the example design. Ways to increase power
include increasing the number of participants, linking the pretests and posttests of individual
participants, looking at the effects of each group leader, and gathering other pretest measures that are
related to self-concept.
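
The gain from linking pretests and posttests can be illustrated with the t statistic, which divides the systematic effect by an estimate of uncontrolled variability. The sketch below uses invented scores and two hypothetical helper functions: one treats the pretests and posttests as unrelated samples (as when they cannot be matched), the other works on each individual's own change.

```python
from math import sqrt
from statistics import mean, stdev, variance

def unlinked_t(pre, post):
    """t statistic when tests cannot be matched: two independent samples
    of equal size, using a pooled variance."""
    n = len(pre)
    pooled_var = (variance(pre) + variance(post)) / 2
    return (mean(post) - mean(pre)) / sqrt(pooled_var * 2 / n)

def linked_t(pre, post):
    """t statistic when each posttest is matched to its own pretest."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical scores for six students who differ a great deal from one another
# but who each improve by a few points.
pre = [30, 35, 40, 45, 50, 55]
post = [33, 37, 44, 47, 54, 57]

t_unlinked = unlinked_t(pre, post)  # about 0.5: the change is lost among student differences
t_linked = linked_t(pre, post)      # about 7: the same change stands out clearly
print(round(t_unlinked, 2), round(t_linked, 2))
```

The systematic effect is identical in both calculations. Linking removes the differences between students from the uncontrolled variability, so the same data yield a far more sensitive test.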



Interpretation phase.-- Let us pretend for a moment that the sample evaluation had been properly
designed with comparable treatment and comparison groups, and that appropriate data analysis led to the
conclusion that self-concept had improved by virtue of program participation in the guidance counselor's
group but not in the language teacher's group. What can be concluded about implementation of the
self-concept curriculum? First, we must ask to what population of children the results apply. Second, we
must ask to what extent the program effects would generalize to other group leaders and to other ways of
measuring the same outcome variables, such as self-concept.

The answer to the first question is obvious. The results apply only to the population of individuals from
whom the participants were drawn. Does this mean that if the program worked for these students, it will
work for the student body at large? Not necessarily. These participants, selected through volunteering or
being drafted, were not representative of the school population. With more complete information on the
participants we could generalize about the type of student who might respond to the program. This cannot
be done, because the evaluation failed to identify a clearly defined target population and draw a sample
representing this population.

Another problem appears if the program works with one group leader but not the other. We must then
return to process questions about each leader and the quality of her relationship with the participants.
The possibility exists that change was due to the characteristics of the group leader rather than to the
curriculum. Change can come from a variety of sources. The same sort of question can be raised about the
measurements: was any change or lack thereof peculiar to the particular test employed, or would the same
results have been found with other measures of self-concept? In all, we ask to what extent evaluation
findings are peculiar to our program and the measurement of its outcomes.

The validity of an evaluation.-- Every issue discussed so far speaks to whether evaluation results give a
valid picture of program effects. Four frequently discussed types of validity provide a way of thinking
about the quality of an evaluation.

Construct validity.-- We can scrutinize whether we have done what was intended when we translated the
original theory to program goals and then operationalized the goals to the program activities. To begin
with, we have a set of abstract notions, or constructs, about what we are trying to transmit through the
program. We also have what we are trying to measure, for example, decreased drug abuse, improved
self-confidence, increased acceptance of responsibility. The extent to which, first, program theory
relates to program practice and then to evaluation activities is referred to as construct validity.

Internal validity.-- If a change has been noted in program participants, we still need to ask whether the
change is attributable to the program or to some other source. For example, if participants' drug use
decreases after a big crackdown on drug dealers in the town, we wouldn't be able to clearly attribute the
decrease to the program unless we had some data from an appropriate comparison group. The ability to
attribute change to the program as opposed to change from other sources is the internal validity of the
evaluation. Whether an evaluation has internal validity is largely determined by the presence of
comparable nonparticipant groups in the design.

External validity.-- All questions of to whom, and to what situations, the results of an evaluation can
be generalized are matters of external validity. A design may be internally valid but have poor external
validity due to the highly restricted sampling of the participants or the unique conditions under which the
[text missing]

Statistical conclusion validity, or conclusion validity.-- Several times we have questioned whether the
design was powerful enough to detect program effects by a statistical test. In fact, any set of data may
be analyzed in a number of ways, some more appropriate than others. Issues of statistical power and
appropriateness of analysis can be summarized as the extent to which the analysis leads to an accurate
assessment of whether or not the scores of program participants changed. These are issues of statistical
conclusion validity, or conclusion validity.

In essence, accurate evaluation findings that are scientifically sound and programmatically useful are
difficult to achieve. The review of the example revealed numerous threats to validity, or failures of the
design to permit sound conclusions about program effectiveness. For example, the small sample sizes and
the unreliable measurements are threats to statistical conclusion validity; the lack of an adequate
comparison group is a threat to internal validity; the lack of documentation of program activities is a
threat to construct validity; the lack of documentation of the nature of participants is a threat to
external validity.



To summarize, confusion can occur



at the beginning 
inside 
outside 

and at the end.



ISSUES IN EVALUATION METHODOLOGY 



The previous discussion of evaluation issues has separated them into those of process and outcome
evaluation. We continue this distinction, addressing first techniques and terminology in process
evaluation, then some important technical issues of outcome evaluation.



There is some disagreement in the field of evaluation research about the appropriateness of describing
the gathering of information on program process as evaluation. Some purists would claim that since
evaluation by definition makes judgments of worth, any information which simply describes an object or
phenomenon is not, in the true sense of the word, evaluative. Others argue that since description is a
necessary prerequisite to evaluation, it is entirely appropriate to consider it as an evaluative
activity, at least by implication. We take the latter position and claim that, depending on the stage of
program development, it is reasonable to develop an evaluation design that consists solely of process
information. Obviously, outcome evaluation provides more information, but even the best outcome
evaluation will include and build on process evaluation.

Process evaluation can be used to provide feedback for internal monitoring, to guide resource
allocation, and [illegible] sources and to illuminate the changing nature of a program as it evolves. In
this sense process evaluation is no more nor less than management information and can be an end unto
itself.

Process evaluation is also necessary for linking outcomes to key program components. A comprehensive
evaluation tests hypotheses about the influence of specific program characteristics and activities on
various outcomes. A careful process description of the program is necessary to understand the findings and
to replicate both the program and its evaluation.

A basic distinction in process evaluation is between input and process. To appreciate what happens in a
program, it is necessary to know what has been brought to it. These inputs include human and physical
resources and the milieu in which the program operates. Each contributes directly to the actual operation
of the program.

Program inputs.-- Human resources include mainly staff and participant characteristics brought to the
program. Important staff characteristics include qualifications as measured by educational level, training,
and experience. Formal education alone is not a sufficient measure to judge abilities. Consideration must
also be given to training and experience specific to the field of alcohol and drug abuse prevention.
Involvement in workshops, conferences, work activities related to prevention, and community involvement
[text missing] alternatives strategies. Such skills are also important for administrative staff, along
with experience in their expected roles. One basic measure of staff effort is expressed in full-time staff
equivalents (FTE). These can be calculated by type of staff activity for both paid and volunteer staff.
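
A full-time equivalent calculation might look like the sketch below. It assumes a 40-hour full-time week, which the text does not specify, and the activity names and weekly hours are invented.

```python
from collections import defaultdict

FULL_TIME_HOURS = 40.0  # assumed length of a full-time week

def fte_by_activity(records):
    """Sum weekly hours into full-time equivalents for each
    (activity, paid or volunteer) combination."""
    totals = defaultdict(float)
    for activity, status, weekly_hours in records:
        totals[(activity, status)] += weekly_hours / FULL_TIME_HOURS
    return dict(totals)

# Hypothetical staffing records: (activity, status, hours per week).
records = [
    ("group facilitation", "paid", 20),
    ("group facilitation", "volunteer", 10),
    ("administration", "paid", 40),
    ("group facilitation", "paid", 10),
]

fte = fte_by_activity(records)
for key, value in sorted(fte.items()):
    print(key, value)
```

Here two part-time paid facilitators together amount to 0.75 FTE, which makes their effort directly comparable with the single full-time administrator.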

Client characteristics include a range of demographics, dependent to some extent on the type of
prevention program. Basic demographics should be collected, such as race/ethnicity, gender, age, grade
level, family structure, or socioeconomic status. This information can help to determine if the program is
serving the intended target population. A major issue is the extent of cultural disparity between staff and
participants, and its effects, both positive and negative, on the program process. These effects are a
question for both process and outcome evaluation.



Both staff and participant inputs should be measured at program inception, at key points during
development, and at the study's conclusion. This allows the choice of a stable period for analysis and
provides information on changes over time that could have a direct bearing on program outcome.

The beliefs, values, and attitudes brought to the program by staff and clients alike will have a major
impact on program effects. Staff and participant attitudes toward alcohol and drug abuse, and the extent
to which these views are similar, are important input considerations. Staff attitudes toward prevention will
greatly affect program activities. It is a truism that events often coincide with our expectations of what
will happen. Staff attitudes toward drug abusers and beliefs about the etiology of abuse will greatly affect
the approach to program tasks. Stated role expectations for both staff and participants will influence
performance. Organizational as well as individual expectations, and any discrepancies that exist between
them, will greatly influence program process.

Basic demographic data should be collected for all participants and staff. Personnel folders should 
detail past and ongoing staff education, skills, and training. Data on attitudes and expectations can be
gathered from interviews (ranging from structured to open-ended) and observations by trained observers. 

Physical resources include space, equipment, and supplies. Each type of resource can be disaggregated
for future analysis in relation to program functions and activities. Physical resources are more amenable
than human resources to easy conversion to a common measure: money. Money, in and of itself, is not
viewed as a true resource. Rather, it is a means of obtaining commodities and measuring their value. If a
program has a cash balance of $50,000, this means little except as it is translated into the number of
counseling sessions or the equipment it will purchase. Monetary conversion of resources, process, and
outcomes becomes a foundation for later standardized cost comparisons.

Environmental variables directly affect the workings of the program. Descriptions of the socioeconomic
structure of the community and its population are necessary to develop a needs assessment that
clearly identifies the potential participant pool. The incidence and prevalence of social problems are
important, particularly those directly related to alcohol and drug abuse. For school programs, measures of
variables such as disciplinary actions, school grades, and vandalism are needed.

Input data provide a basis for determining if the program as implemented serves the intended target
population, and if this population adequately represents those shown in need. Other relevant questions are 
whether the staff meet necessary standards and if resources are sufficient to accomplish program 
objectives. Specific questions must arise out of the particular program situation. 

Program process.-- As with inputs, program process can be measured using both qualitative and
quantitative indicators. Three basic aspects of a program's functioning should be examined during a process
evaluation:

o organizational structure

o patterns of interaction

o program service delivery.

The field of organizational analysis is growing rapidly, with increasing sophistication in methods. For
example, structural analysis compares formal patterns, as found in organizational charts, with actual
patterns of authority, responsibility, and communications. Systems analysis is more concerned with
measuring the dynamic aspects of the organization. One useful way to describe the organization is
presented by Cline and Sinnot (1980), using five interdependent dimensions.

The task dimension describes the organization as a set of tasks interconnected by authority and
accountability relationships. Major tasks and the activities undertaken to achieve specific objectives are
[illegible]. For school-based prevention programs, two possible data sources are the course
curriculum and job descriptions.

The function dimension describes the organization as a set of operating units interconnected by the
ways in which they act and react to one another. While the task dimension focuses on activities within
units, this dimension emphasizes the interrelation of units in achieving organizational goals. A common
data source is the organizational chart, which is taken as a starting point for examining actual structural
relationships.



The information dimension is concerned with mapping the flow of information and identifying key
decision points. This dimension is closely related to the task and function dimensions, in that
decisionmaking is part of the formal functions of various individuals and units. This dimension represents
the first step in an analysis of the decisionmaking activity.



The fiscal dimension describes the organization as monetary resources connected by budgetary and
accounting relationships. The major focus is on the allocation of resources, which leads to measures of
cost effectiveness. Budget and expenditure statements are the basic source of information in describing this
dimension.

The personnel dimension, which describes the organization as a group of persons interacting on a daily
basis, is probably the most difficult to express in quantifiable terms and is more likely to be described based
on observations of interactions. This is a time-consuming process, with the observer's major task being to
limit observations to the most important interactions.

An alternative to Cline and Sinnot's approach encompasses the three basic aspects of function already
mentioned (structure, interaction, and service delivery) and develops a comprehensive description of the
organization as it attempts to achieve its goals.

The major emphasis of process evaluation is the delivery of services. An evaluation of services should
describe intended content, the timing of delivery, and its integrity, that is, whether what is delivered
matches what is intended. Quantitative measures can include the number of meetings or sessions, the
number of participants, the ratio of staff to participants, actual versus expected attendance, and the
physical surroundings of the service delivery.

Qualitative and quantitative methods.--Only recently have the arguments about the relative merits of
quantitative and qualitative methods begun to reach a resolution. Cronbach et al. (1980, p. 223)
provide the evaluator with a cautionary note:

     The evaluator will be wise not to declare allegiance to either a quantitative-manipulative-
     summative methodology or a qualitative-naturalistic-descriptive methodology. He can
     draw on both styles at appropriate times and in appropriate amounts. Those who advocate
     an evaluation plan devoid of one kind of information or the other carry the burden of
     justifying such exclusion.

Quantitative methods leading to hypothesis testing view the program as a fixed stimulus applied to the
social system. These methods employ experimental designs and statistical techniques to determine if
hypothesized effects occur. This is the sense of the term "manipulative" methodology: the program is
seen as a manipulation of an existing reality.

Qualitative methods employ participant observation, open-ended interviews, and other so-called
subjective approaches to examine the program as a system unto itself, and as a part of larger systems. The
emphasis is on what the program is and does as seen by those involved. In the past, qualitative methods
were viewed by quantitative researchers (number crunchers) only as a way to develop and formulate
hypotheses for future examination by objective quantitative methods. Now there is a growing recognition
that the information from the two paradigms is complementary and that the issue of subjectivity
versus objectivity should not be drawn along methodological lines (Patton 1978).

This issue is crucial to the evaluation of prevention programs, where the cultural mix of participants,
staff, and community is a major factor in determining the structure and dynamics of the program.
The evaluator who doesn't appreciate the enormity of cultural effects throughout the entire evaluation
process is likely to do a disservice to the program.

The information, both quantitative and qualitative, that could be gathered in a process evaluation is
practically limitless. The major problem for the evaluator and the program manager is to decide what is
essential for the evaluation. Within resource constraints, limits should be set to allow enough freedom to
identify key elements related to goal attainment, without taking away from the full richness of the program.

Outcome Methodology

In this section, we present major technical matters that are critical to understanding outcome
evaluation design and analysis. First, we cover the construction of comparable experimental and comparison
groups for an evaluation design. Here, we consider threats to internal validity, which are either eliminated
or produced by the construction of the experimental and comparison groups. Second, we consider concepts
of statistical inference. Finally, we review some concepts of measurement, expanding upon definitions of
reliability and validity.

Threats to internal validity.--Attributing change in program participants to the program itself requires
proof that participants are more different after experiencing a program than they would have been had they
not experienced the program. The strategy used to make the participant-nonparticipant comparison is to
construct comparable groups that do and do not participate and compare the groups at the same points in
time. Perhaps the most critical issue in outcome evaluation is how these comparable groups are formed.

An obvious way to select comparable groups is to match two groups on important variables. However,
there's a trap in this: which variables to match. In a self-concept program, for example, we would want to
match on variables known to be related to self-concept. While we may not be sure what those variables are,
we suspect the list is long. If we try to match but miss some critical variable related to self-concept, then
we can't claim comparability; our evaluation is undermined before we begin. Our theory for prevention
needs to be carefully assessed to guide the variable selection process.

True experiments.--Another approach might be to take all the individuals who could be participants at
any point in time and randomly divide them into participants and nonparticipants. If this is done with
reasonably sized groups (e.g., N=36), the result will be two groups theoretically comparable on all variables.
But how does sampling theory lead us to this statement?

Imagine splitting a group of 100 people randomly into two groups by flipping a coin to determine each
person's group membership. These groups should be approximately equal in height, education level, need for
approval, anxiety; in fact, in every characteristic one might name. Why? Because the outcome of the coin
toss is in no way related to any other variable, and the laws of probability are permitted to operate fully.
The coin cannot tell how tall, how well educated, or how anxious anyone is. These variables (by chance) will
be distributed equally across groups.
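
The coin-toss argument can be checked with a small simulation. The sketch below is illustrative only: the
100 hypothetical people, their heights and anxiety scores, and the function name coin_flip_assignment are
assumptions made for the example, not data from any evaluation. It repeats the coin-flip split many times
and records how far apart the two group means fall.

```python
import random
import statistics


def coin_flip_assignment(people, rng):
    """Split people into two groups with one independent coin flip per person."""
    heads, tails = [], []
    for person in people:
        (heads if rng.random() < 0.5 else tails).append(person)
    return heads, tails


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100 people: height (cm) and an anxiety score.
people = [{"height": rng.gauss(170, 10), "anxiety": rng.gauss(50, 10)}
          for _ in range(100)]

# Repeat the coin-flip split many times and record the gap between group means.
height_gaps, anxiety_gaps = [], []
for _ in range(2000):
    group_a, group_b = coin_flip_assignment(people, rng)
    if not group_a or not group_b:
        continue  # an empty group is practically impossible with 100 people
    height_gaps.append(statistics.mean(p["height"] for p in group_a)
                       - statistics.mean(p["height"] for p in group_b))
    anxiety_gaps.append(statistics.mean(p["anxiety"] for p in group_a)
                        - statistics.mean(p["anxiety"] for p in group_b))

mean_height_gap = statistics.mean(height_gaps)      # close to zero on average
spread_height_gap = statistics.pstdev(height_gaps)  # size of a typical chance gap
mean_anxiety_gap = statistics.mean(anxiety_gaps)
print(f"height: average gap {mean_height_gap:.2f}, typical gap {spread_height_gap:.2f}")
print(f"anxiety: average gap {mean_anxiety_gap:.2f}")
```

Any single split leaves a small chance gap between the groups, but the gaps center on zero for every
variable, including ones the evaluator never thought to measure.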

This method of constructing groups, referred to as random assignment, is the one method of
constructing groups theoretically comparable on all variables that might influence the outcome of an
evaluation. Experiments or evaluations using this method for constructing groups are called true
experiments or randomized experiments.

Quasi-experiments.--Although true experiments are the most desirable, sometimes they cannot be
constructed. For example, if the whole fifth grade of a school is to receive a program, no fifth graders
remain to serve as controls. Ethical issues may also preclude withholding treatment, even temporarily,
from some potential participants. These situations call for quasi-experiments, a category in which the
experimental and control groups are not constructed by random assignment. Unfortunately, in
quasi-experiments, some internal validity is lost. This means that if one does find a difference between treatment
and comparison groups at the end of the experiment, one cannot be certain that the difference was due to
the program's effect. It could be due in part to differences that already existed between the groups.

So profound is the difference between true and quasi-experimental designs in yielding answers to
evaluation questions that the groups in the two types of designs are called by different names. In a true
experiment, the nonparticipant group is called a control group. In a quasi-experiment, the nonparticipant
group is called a comparison group.

True versus quasi-experiments.--The reason for having control or comparison groups
is to mitigate threats to internal validity, that is, to eliminate confounding effects that prevent attributing
outcomes to the program. To illustrate, figure 1a shows one possible outcome of a true experiment
involving a school prevention program. Both groups increase drug experimentation over the semester, but
the group that participated in the program showed less increase. The program apparently retarded the rate
of increase in drug use. Now consider the same effect in a quasi-experiment in which volunteers were
participants and nonvolunteers were controls. In figure 1b the comparison group also showed a greater
increase in experimentation than the participant group. Is this difference clearly attributable to the
program? No. The self-selected treatment group was less prone to use drugs than the comparison group
before the program began. It is possible that the different initial levels of drug use, regardless of the
program, influenced the rate of increase in drug experimentation. The main threat then to internal validity
in quasi-experiments is the selection factor that brings the treatment and comparison groups into the
experiment.

Figure 1. Some outcomes of true and quasi-experiments

Achieving random assignment through delayed treatment.--Randomly assigning individuals to receive or
not receive potentially beneficial treatment is contrary to the belief that treatment should be readily
available to all who wish it. A way to achieve random assignment and ultimately to have everyone
participate is to delay treatment to some individuals. This useful technique for achieving
random assignment is illustrated in figure 2.

Figure 2. Waiting list technique for achieving random assignment



Suppose there were more applicants for program participation than program slots. One handles this by
having people wait until slots become available. Assume there are 60 people on the waiting list and only 30
slots. The waiting list is used to construct true experimental and control groups by randomly assigning the 60
individuals to one of two groups. An immediate treatment group enters the program without delay and a
delayed treatment group enters the program after the immediate group leaves. The delayed treatment
group serves as the control group, as shown in figure 2.
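
The assignment step just described can be sketched in a few lines. This is a minimal illustration, not a
prescribed procedure: the function name waiting_list_assignment, the applicant labels, and the fixed seed
are hypothetical choices made for the example.

```python
import random


def waiting_list_assignment(applicants, slots, seed=None):
    """Randomly split a waiting list into immediate and delayed treatment groups."""
    rng = random.Random(seed)
    shuffled = list(applicants)   # copy so the original list is left untouched
    rng.shuffle(shuffled)         # every ordering is equally likely
    immediate = shuffled[:slots]  # enter the program without delay
    delayed = shuffled[slots:]    # wait, serving as the control group
    return immediate, delayed


# Hypothetical waiting list: 60 applicants for 30 program slots.
applicants = [f"applicant_{i:02d}" for i in range(1, 61)]
immediate, delayed = waiting_list_assignment(applicants, slots=30, seed=1)
print(len(immediate), "in immediate treatment;", len(delayed), "in delayed treatment")
```

Shuffling the whole list and cutting it at the number of slots gives every applicant the same chance of
immediate treatment, which is what makes the delayed group a true control group.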

All individuals are pretested. Next, the immediate group receives treatment. When
the immediate group completes the program, both groups are tested again. Finally, individuals in the
delayed treatment group receive their posttest at the completion of treatment.

Attrition destroys the benefit of random assignment.--Groups constructed by random assignment at the
beginning of an evaluation should be equivalent at the end of the evaluation if membership in the groups
remains stable. Differential attrition, or mortality, of group members threatens the internal
validity of true experiments. In evaluations, every effort should be made to keep subjects in the groups
throughout the experiment.

Random assignment to program components.--There are not always waiting lists. In some circumstances,
even temporary nonparticipants cannot be designated. The evaluation then may be a contrast
among variations in programing, rather than between program and no program. For example, if there is a
conventional program against which a novel program might be compared, then the random assignment might
be to the conventional versus the novel treatment.

Regression effects.--In statistical analysis there is a tendency with repeated measures to regress
toward similarity, or to the group mean. This is called a regression effect, regression artifact, or statistical
regression. This problem is particularly acute when groups are selected on the basis of extreme scores, e.g.,
high drug use versus low drug use. True experiments control this problem to a large extent because groups
are randomly selected rather than preselected. In quasi-experiments these effects can be troublesome
because of the process of forming comparison groups. Comparing volunteers in a treatment group with
nonvolunteers in a comparison group is a case in point: the selection factors that determine
who will volunteer undermine attempts to attribute any posttest differences to the program itself.

This is another dimension to the problem that arises when selecting
different groups so they match on specific variables. For example, suppose a prevention program is mounted
in a school with substantial drug problems, while the comparison group for evaluation might be drawn from
another school with less of a problem. One might test the children of both schools on drug use and try to
select subsets of children from the two schools whose drug use levels matched. While this may appear to
solve the problem of noncomparable groups on drug use, it does not, due to regression effects.



Regression effects occur because measures are not perfectly reliable. If the drug use scale is given
twice, there will be different amounts of reported use. If the test were unreliable, a respondent with a very
low drug use score on the first measurement would likely have a higher score on the second measurement.
Why? Tests do not have perfect reliability because respondents change some answers between two test
administrations. If a student gives a very low estimate of use the first time he took the test, the only way
he can change his answers is to report higher use levels. In contrast, if a respondent reports very high use
the first time, the only way his answers can change is to lower levels.

Regression effect has nothing to do with the true level of the behavior.
It has to do with the unreliability of the measure.

For example, suppose the test asked how many times a respondent had used drugs, and the
alternative choices were 1-5, 6-10, 11-15, 16-20, 21-or-more times. A frequent user may puzzle over the
choices 16-20 versus 21-or-more, but can't really decide, and arbitrarily picks 21-or-more. He's got a
high use score. The next time the subject encounters the item, he still can't decide and randomly chooses the
category 16-20. The subject's drug use hasn't changed. What's changed is his random choice of responses in
an uncertain situation. The same argument goes for the low end of the scale.

If we sort respondents into two extreme groups based on the drug use score on the first test
administration and retest them, regression artifacts should cause the data to look like those in figure 3.

Figure 3. Regression artifacts in a single group

The extreme scorers on the first test administration have scores closer to the average (or middle) of the
drug use range of their group on the second administration. The amount of change is a statistical function
of the reliability coefficient of the test being used: the less reliable a test, the greater the chance of a
regression artifact.2 As shown in figure 3, one could not attribute any change in outcomes to program
effects. One could not conclude that the program lowered drug use levels for high users or that the program
increased drug use behavior of those in the low-use category. Obviously, the process of selecting
treatment and control groups has serious implications for correct interpretation of evaluation data, given
the imperfect world of measurement.
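
A small simulation makes the artifact concrete. The sketch below assumes a simple model that is not taken
from the text: each respondent has a stable true level of use, and each test administration adds independent
random error. The score scale, the amount of error, and all variable names are hypothetical. No program is
applied between the two testings.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable
N = 2000

# Assumed model: a stable true level plus independent error at each testing.
true_level = [rng.gauss(10, 3) for _ in range(N)]
test1 = [t + rng.gauss(0, 3) for t in true_level]
test2 = [t + rng.gauss(0, 3) for t in true_level]  # true levels are unchanged

# Sort respondents by first-test score and take the lowest and highest tenths.
order = sorted(range(N), key=lambda i: test1[i])
low_group = order[:N // 10]
high_group = order[-(N // 10):]


def group_mean(scores, members):
    """Mean score of the listed respondents."""
    return statistics.mean(scores[i] for i in members)


low_first, low_second = group_mean(test1, low_group), group_mean(test2, low_group)
high_first, high_second = group_mean(test1, high_group), group_mean(test2, high_group)

print(f"low scorers:  first test {low_first:.1f}, retest {low_second:.1f}")
print(f"high scorers: first test {high_first:.1f}, retest {high_second:.1f}")
```

Under these assumptions the low scorers drift up and the high scorers drift down on retest, even though
nobody's true level changed. Making the error term smaller (a more reliable test) shrinks the drift.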



One way to attempt to achieve comparability is to match students on drug use from two schools, where
average drug use levels differ. This situation is illustrated in figure 4.

Figure 4. Pretest drug use from two schools

There are several major problems with this matching approach. The experimental school has a higher
average drug use than the comparison school. Students from the two schools are matched together by use
scores. Only those students in the shaded area in figure 4 can be matched because the school averages are
different. The greater the difference in average scores, the fewer matches can be found. Thus the first
problem with this approach is that the sample size available for analysis is smaller, reducing the power of
the analysis.



Further, the students are being matched on only one factor, their drug use scores. The unstated and
undoubtedly false assumption is that students in the two schools are similar in all other respects which have
a bearing on drug use. Moreover, including additional matching variables would further reduce the number of
possible matches, leading to even smaller sample sizes.

Figure 5. Results of matching from nonequivalent groups
(Each group regresses to its own average.)

Finally, the matching approach can substantially increase regression effects. The lowest scorers in the
experimental school can be expected to have a higher average score on retesting. For the same reason, the
highest scorers in the comparison school will have lower scores on retesting. And these are the very
students we have chosen by matching scores. As figure 5 shows, it will appear that the program has caused
increased drug use, while comparison subjects (with no intervention) will appear to have decreased drug use.
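
The matching trap can also be simulated. The sketch below is built on stated assumptions rather than on
any real data: two hypothetical schools with different average true use, an unreliable test given twice,
and no program effect at all. The function names, the school averages, and the matching tolerance are
illustrative choices.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable


def simulate_school(true_mean, n):
    """Return (pretest, posttest) pairs; no program effect is simulated."""
    students = []
    for _ in range(n):
        true_level = rng.gauss(true_mean, 3)
        pretest = true_level + rng.gauss(0, 3)   # measurement error
        posttest = true_level + rng.gauss(0, 3)  # true level unchanged
        students.append((pretest, posttest))
    return students


def match_on_pretest(group_a, group_b, tolerance=0.5):
    """Greedily pair students whose pretest scores differ by at most tolerance."""
    a, b = sorted(group_a), sorted(group_b)
    pairs, j = [], 0
    for student_a in a:
        # Skip comparison students who score too low to match this student.
        while j < len(b) and b[j][0] < student_a[0] - tolerance:
            j += 1
        if j < len(b) and b[j][0] - student_a[0] <= tolerance:
            pairs.append((student_a, b[j]))
            j += 1  # each comparison student is used at most once
    return pairs


experimental = simulate_school(true_mean=14, n=1000)  # higher-use school
comparison = simulate_school(true_mean=8, n=1000)     # lower-use school
pairs = match_on_pretest(experimental, comparison)

exp_change = statistics.mean(a[1] - a[0] for a, _ in pairs)
comp_change = statistics.mean(b[1] - b[0] for _, b in pairs)
print(f"{len(pairs)} matched pairs out of 1000 students per school")
print(f"experimental change {exp_change:+.2f}, comparison change {comp_change:+.2f}")
```

In this sketch the matched experimental students appear to increase their use and the matched comparison
students appear to decrease it, although nothing was done to either group. Each matched subset is simply
regressing toward its own school's average.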

Statistical regression, or regression effects, operate whenever extreme groups are used in designs.
They are subtle and treacherous and most likely to creep into evaluation designs when matching is used to
achieve apparent pretest equivalence.

To summarize:

true experiments are more desirable because
they overcome threats to internal validity.

Concepts of statistical inference.--When we do an evaluation our interest goes far beyond the particular
individuals who participated in the evaluation. We wish to generalize to other individuals who might
participate in the program. Put another way, concluding from an evaluation that a program worked and
ought to be continued or tried elsewhere really predicts that the program will work in the same way for
other individuals in a comparable setting.

We base conclusions from our data on the rules of statistical inference, which constitute a logical
system for making such generalizations based on probability theory. We will review this logical system,
defining many of the terms associated with it as we go.



Populations versus samples.--The first necessary distinction is between populations and samples. A
population, for our purposes, is a clearly delimited group of individuals, say, all the fifth graders in a
particular school system. A sample is just a subset of a population. Studying randomly
selected samples from a population allows generalizations about that population, and statistical inference is
the basis of the generalizations. If we could study the whole population, we wouldn't need statistics.



Power and Type II error.--Although the purpose of statistical inference is to generalize from samples to
populations, it's easier to understand statistical inference if we work backwards. Assume two populations of
individuals who are identical. More specifically, they are identical on the variable of self-concept. Put in
the usual statistical terminology, the two populations have identical self-concept arithmetic means.
Arithmetic means are what we commonly describe as averages; they're usually referred to as means in the
context of statistics. Now suppose we assign one population to a self-concept program and the other
population serves as a control group. At the end of the program, the mean self-concept score in the
population that participated in the program is five points higher than that of the control population. That is,
there is a true difference between the population means. We conclude, all other things being equal, that the
program produced the five-point advantage.

Given this true difference in population means, suppose we do the following exercise. Draw a random
sample of 25 people from each population and note the difference in mean self-concept in the two samples.
Having recorded this difference, we return the people to the population and draw another pair of samples,
note the difference between their means, and return them to their populations. If we do this repeatedly, we
will observe that the difference between the means will usually be around five points, in favor of the sample
from the participant (treated) population. Sometimes the difference will be greater than five points, still in
favor of the sample from the treatment population, and at other times the difference will be less than
five points. In a few cases, perhaps, the sample from the control population has a higher mean score.

That is, individual samples do not perfectly reproduce
the populations from which they were drawn.
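
The repeated-sampling exercise is easy to run on a computer. The sketch below assumes normally
distributed self-concept scores with a standard deviation of 10 and a control mean of 50; those values, like
the variable names, are assumptions for illustration, since the text specifies only the five-point difference
and the sample size of 25.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the sketch is repeatable
SD = 10                 # assumed spread of scores (not given in the text)
SAMPLE_SIZE = 25
TRUE_DIFFERENCE = 5     # treated population mean minus control population mean


def sample_mean(population_mean):
    """Mean of one random sample drawn from a population with the given mean."""
    return statistics.mean(rng.gauss(population_mean, SD) for _ in range(SAMPLE_SIZE))


# Draw 5,000 pairs of samples and record treated-minus-control differences.
differences = [sample_mean(50 + TRUE_DIFFERENCE) - sample_mean(50)
               for _ in range(5000)]

average_difference = statistics.mean(differences)
share_reversed = sum(d < 0 for d in differences) / len(differences)
print(f"average difference between sample means: {average_difference:.2f}")
print(f"share of pairs in which the control sample scored higher: {share_reversed:.3f}")
```

Under these assumptions the differences cluster around five points, yet a small share of sample pairs
shows the control sample ahead, which is the situation that leads to the errors discussed next.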



To continue, suppose that instead of having repeated measures of the populations, we have only
one pair of samples. On the basis of the sample self-concept means in the treated versus untreated
samples, we would have to draw a conclusion as to whether the program worked. What sort of rule might be
used to reach a conclusion? We could use a rule that says, "If the treated sample is above the untreated
sample by any amount, decide that the program worked." Now, for most pairs of samples we drew, there
would be a difference in favor of the treated group, and we would correctly conclude that the treatment
caused a gain in self-concept. In statistics, a correct conclusion is one that reflects what is actually true of
the populations from which the samples were drawn.

In most instances, we would correctly conclude that the program had caused an increase in
self-concept. But for some pairs of samples, those in which there was no difference or perhaps a reversal, we
would incorrectly conclude that there was no effect. This sort of error is called a Type II error (more about
this later). Note that this problem of failing to detect a difference that really exists in the population is
precisely what we were concerned with when we discussed the statistical power of an evaluation design.
Power and Type II errors are opposite sides of the same coin, that is, detecting versus failing to detect a
true effect of a program.

Or, in other words,

when you improve the power of a design,
you reduce the chances for a Type II error.



Type I error.--Now consider another situation. Once again, begin with two identical populations, and
treat one population with the self-concept program. This time, however, assume that the program has no
effect; that is, the two population means remain equal. Again, imagine taking pairs of samples
from these populations and calculating the difference between their means over repeated samplings. Most
differences will be about zero. But, from time to time, the mean of the sample from the treated population
will be somewhat higher than that of the control sample. In those instances we can make the error of
concluding that the program worked, when, in fact, it did not really work in the population. This sort of
error is called a Type I error.

Keeping in mind equal population means (the program had no effect) versus unequal population means
(the program had an effect), we can differentiate the two situations in the form of a pair of hypotheses.
One hypothesis, the null hypothesis, says that the group means are equal; the program had no effect. The
other hypothesis, the alternate hypothesis or research hypothesis, says that the program worked, that is, the
group means are unequal. Note that these two hypotheses exhaust the possibilities for the outcome of an
evaluation. If we can amass evidence that the null hypothesis (of no effect) is false, then
we are simultaneously amassing evidence that the alternate hypothesis (there is an effect) is true.

Now, in the real world we have no knowledge of the population; we are trying to infer what exists in our
population from looking at sample data.

Based on probability theory,
we make conclusions about the population(s)
and then qualify those conclusions
by stating the odds that they are wrong.

Again, let's say we observe a five-point advantage in self-concept in our treated over our control
sample. We make the statistical decision to reject the null hypothesis, that is, we conclude that the
populations must be different because the samples are different, as in the first situation discussed. But
there's another possibility: the population means might really be the same, as in the second situation, but by
chance we've drawn samples that make it seem that the populations are different. Through probability
theory we are able to determine the chance that we will have made an error in rejecting the null hypothesis,
that is, a Type I error.



The probability of a Type I error is called the level of significance of the statistical decision to reject
the null hypothesis. In evaluation reports, you will see sentences like, "The treatment group had a
significantly higher self-concept mean than did the control group (p<.05)." "Significantly higher" says that
the null hypothesis (the group means are equal) is being rejected. The (p<.05) in parentheses gives the
probability of a Type I error: if the population means were really equal, there would be less than a
5 percent chance of drawing samples that differ this much. Note that we are worried only about Type I
error when we are rejecting the null hypothesis, that is, concluding that the groups are different, or that the
program worked. A final point: the lower case Greek letter alpha (α) is sometimes used to indicate the
probability of Type I error. When people ask what alpha level you're using, they're asking how much Type I
error is associated with your statistical decisions. It is only by convention that no more than 5 percent Type
I error is acceptable to reject the null hypothesis.
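
To see what a 5 percent alpha level buys, the sketch below simulates many evaluations in which the
program truly has no effect and counts how often each decision rule wrongly declares that it worked. It is
a simplified illustration under stated assumptions: scores are normal with a known standard deviation of
10, each group has 25 people, and a two-sided z test stands in for whatever test a real evaluation would
use. All names are hypothetical.

```python
import random
import statistics
from statistics import NormalDist

rng = random.Random(11)                 # fixed seed so the sketch is repeatable
SD, SAMPLE_SIZE, ALPHA = 10, 25, 0.05   # assumed values for the sketch


def p_value(treated, control):
    """Two-sided z test for a difference in means, treating SD as known."""
    standard_error = SD * (2 / SAMPLE_SIZE) ** 0.5
    z = (statistics.mean(treated) - statistics.mean(control)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


trials = 4000
naive_errors = 0         # Type I errors under the "any difference" rule
significance_errors = 0  # Type I errors under the p < alpha rule
for _ in range(trials):
    # The null hypothesis is true: both samples come from the same population.
    treated = [rng.gauss(50, SD) for _ in range(SAMPLE_SIZE)]
    control = [rng.gauss(50, SD) for _ in range(SAMPLE_SIZE)]
    if statistics.mean(treated) > statistics.mean(control):
        naive_errors += 1
    if p_value(treated, control) < ALPHA:
        significance_errors += 1

naive_rate = naive_errors / trials
significance_rate = significance_errors / trials
print(f"'any difference' rule wrongly says it worked: {naive_rate:.2f}")
print(f"p < .05 rule wrongly rejects the null:        {significance_rate:.3f}")
```

With no true effect, the "any difference" rule is wrong about half the time, while the significance rule
holds the Type I error rate near the chosen alpha level.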

Power analysis.--Historically, science is conservative. Hence the emphasis has traditionally been
placed on Type I errors. Nobody wants to conclude that some intervention or treatment has an effect when
it doesn't. In the context of program evaluation, however, there also should be enormous concern for Type II
errors, that is, failing to conclude that an effective program is effective because the power of the design is very
low. The lower case Greek letter beta (β) is used to note the probability of a Type II error.

The power of a design depends on a number of factors, such as the magnitude of the program's effect. In
the previous situation in which the treated population was five points higher than the control population,
samples from the two populations could sometimes be expected to have the same means and to lead to Type
II errors. If the difference between means in the populations had been larger, say a 20-point difference, the
chance of drawing samples that showed no difference would be much smaller, thereby decreasing the
probability of a Type II error, or conversely increasing the power of the design.

Uncontrolled sources of variability in an evaluation design decrease the power of the design. To
determine the power of a design, consider the amount of difference between the populations relative to
uncontrolled variability. The term effect size is used to mean the amount of difference produced by the
program, relative to a measure of uncontrolled variability. The amount of uncontrolled variability is always
considered relative to the number of subjects in the design. Increasing the sample size increases the power
of the design.

An analysis of the power of a design is best performed while the evaluation is being planned. To
accomplish this, an estimate of the effect size (difference relative to uncontrolled variability) is required.
Evaluators will often ask if any pilot data for a program already exist or can be collected before a full-scale
evaluation is mounted, to make an estimate of effect size. With such an estimate, the number of subjects
required to detect those effects in an evaluation design can be determined. Sometimes the effects are so
small that enormous numbers of subjects would be required to detect them. When such effects could be
detected only by using large numbers of subjects, the execution of a labor-intensive and costly evaluation
may not be warranted.
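
The planning calculation can be sketched with a standard normal approximation. This is a rough
illustration, not the only way to do a power analysis: it assumes two equal-sized groups, a two-sided test,
and an effect size defined as the difference in means divided by the standard deviation. The function
names are hypothetical, and the approximation slightly understates what a t-test would require.

```python
from math import ceil, sqrt
from statistics import NormalDist

_normal = NormalDist()  # standard normal distribution


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means."""
    z_alpha = _normal.inv_cdf(1 - alpha / 2)
    return _normal.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)


def subjects_per_group(effect_size, power=0.80, alpha=0.05):
    """Approximate number of subjects needed in each group for the stated power."""
    z_alpha = _normal.inv_cdf(1 - alpha / 2)
    z_power = _normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# A 5-point difference with an assumed standard deviation of 10 is an
# effect size of 0.5; smaller and larger effects are shown for contrast.
for effect_size in (0.2, 0.5, 0.8):
    needed = subjects_per_group(effect_size)
    print(f"effect size {effect_size}: about {needed} subjects per group")

print(f"power with 25 per group at effect size 0.5: {approximate_power(0.5, 25):.2f}")
```

The required number of subjects grows quickly as the expected effect shrinks, which is why an early
estimate of effect size matters so much when deciding whether an evaluation is worth mounting.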

Power analysis may also be performed after an evaluation. This is particularly critical when the
evaluation has detected no effect of the program (the null hypothesis was not rejected). In this case the
concern is whether the design was so weak in terms of statistical power that an effect that really existed
could not have been detected in the design.

Some common statistical tests.--Statistical tests are calculations to determine what the probability of
a Type I error (false rejection) would be if the null hypothesis were rejected. If the probability of a Type I
error is low based on a statistical test, say less than 5 percent, then we would typically reject the null
hypothesis.

Many tests can be used, and the choice depends on the nature of the data. Here we mention only some
very common tests. The simplest is the t-test, which tests whether two groups are different or not on some
measure, using the mean. If there are more than two groups in the design, Analysis of Variance (ANOVA) is
used for the same purpose, to test whether the several groups in the design differ.
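
As an illustration of what a t-test computes, the sketch below works out the pooled-variance t statistic
for two groups by hand. The scores are hypothetical and the function name pooled_t is an illustrative
choice; in practice a statistical package would also supply the probability of a Type I error.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = ((n_a - 1) * variance(group_a)
                       + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical posttest self-concept scores.
treated = [58, 62, 55, 60, 64, 59, 61, 57]
control = [52, 56, 50, 55, 58, 51, 54, 53]

t, df = pooled_t(treated, control)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The statistic is the difference between the group means divided by its standard error. A t larger in
absolute value than the tabled critical value (roughly 2.14 for 14 degrees of freedom at the two-sided .05
level) would lead to rejecting the null hypothesis.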

Analysis of Covariance (ANCOVA) is a statistical procedure that does what ANOVA does but also
adjusts for initial uncontrolled sources of variability, increasing the power of the statistical design. For
example, if participants vary widely among themselves in self-concept before the program begins, then it
will be difficult to detect later changes. ANCOVA reduces this uncontrolled variability by linking various
pretest and posttest measurements on each individual.

In quasi-experiments, where the treatment and comparison groups are not equivalent, such statistical
procedures must be employed to tease apart two potential sources of difference between groups at the end
of the experiment: the effect of the treatment, and the initial differences between the groups. Any
statistical adjustments are approximate at best; they do not guarantee accurate estimates of the effect of
the treatment.

Concepts of measurement.--When we scrutinized the sample evaluation, we identified two important
properties of measures. First was reliability, or the stability of a measure. Second was validity, or the
extent to which a test measures what it purports to measure.

Reliability.—The definition of reliability really encompasses two notions: stability and
internal consistency. Stability means that if one takes a test twice and doesn't change on the trait being
measured, then the test score also should not vary much over repeated testing. The usual way in which this
type of reliability is established is by administering the same test twice to a group and calculating
a measure of the extent of agreement between the two test results. The basic measure used is called a
correlation coefficient. The coefficient will equal 1.0 if there is perfect agreement between the two
measurement points. It will equal zero if there is no relationship between the scores at the two
measurement points. It will be negative, somewhere between 0.0 and -1.0, if scores get reversed over the
two measurement points; that is, if the high scorers at the first measurement point become the low scorers
at the second measurement point and vice versa. The correlation coefficient is referred to as a reliability
coefficient in this context.
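The behavior of the correlation coefficient described above can be checked with a short sketch. It
computes the standard Pearson product-moment correlation between two administrations of a test; the
function name and the scores are assumptions made for illustration.

```python
import math


def pearson_r(first_scores, second_scores):
    """Pearson correlation between two sets of scores for the same people."""
    n = len(first_scores)
    mean_1 = sum(first_scores) / n
    mean_2 = sum(second_scores) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_1) * (b - mean_2)
                for a, b in zip(first_scores, second_scores))
    ss_1 = sum((a - mean_1) ** 2 for a in first_scores)
    ss_2 = sum((b - mean_2) ** 2 for b in second_scores)
    return cross / math.sqrt(ss_1 * ss_2)


# Hypothetical scores for five people tested twice.
first_testing = [10, 20, 30, 40, 50]
same_ordering = [12, 22, 32, 42, 52]   # everyone keeps the same rank
reversed_ordering = [50, 40, 30, 20, 10]  # high scorers become low scorers

r_stable = pearson_r(first_testing, same_ordering)
r_reversed = pearson_r(first_testing, reversed_ordering)
```

Here the stable case gives a coefficient of 1.0 and the fully reversed case gives -1.0, the two extremes
named in the text.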
The second aspect of reliability, internal consistency, is a measure of the extent that all the items or
questions on a test agree with one another, or measure the same thing. If we have a self-concept scale, a
person with a poor self-concept overall should respond in the same way across the items of the
scale.³ A common measure of such reliability is Cronbach's Coefficient Alpha, another index that equals
zero if there is no consistency among items, and approaches 1.0 as internal consistency increases. The
Kuder-Richardson formula is another common measure of internal consistency.
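For readers who want to see how Coefficient Alpha is obtained, here is a minimal sketch using the
usual textbook formula: alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where
k is the number of items. The data layout (one row per respondent, one column per item) and the
responses themselves are assumptions for illustration.

```python
import statistics


def cronbach_alpha(responses):
    """Coefficient alpha; responses has one row per respondent, one column per item."""
    k = len(responses[0])
    items = list(zip(*responses))  # regroup the scores item by item
    item_variance_sum = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of four respondents to a three-item scale.
# Every item ranks the respondents identically, so consistency is perfect.
perfectly_consistent = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 4],
]
alpha = cronbach_alpha(perfectly_consistent)
```

With perfectly consistent items the index reaches 1.0; as items agree less with one another the sum
of item variances grows relative to the variance of the totals and alpha falls.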



These measures are appropriate only with homogeneous tests, those where all the individual items are
measuring one thing. It is possible to increase reliability of the score on the whole test by increasing the
number of items on the scale. Statistical estimates have been developed of the extent of increase in
reliability to be expected by increasing the number of items. The classical estimate is the Spearman-Brown
prophecy formula.
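The Spearman-Brown prophecy formula can be written as a one-line function. In the sketch below,
`reliability` is the current reliability coefficient and `length_factor` is how many times longer the
test becomes (2 means the number of items is doubled); both names are illustrative choices.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after lengthening a test by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Doubling a scale whose reliability is 0.60 (a hypothetical starting value).
predicted = spearman_brown(0.60, 2)
```

Under the formula's assumption that the added items are comparable to the existing ones, doubling a
scale with a reliability of 0.60 would be expected to raise it to 0.75.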



Validity.—Validity of a measure is a broad concept. There are a number of ways to establish validity.
At the lowest level is face validity; that is, the content of the items seems to agree with what the test is
supposed to measure. For example, in a test of depression, if it appears that people who are depressed will
respond in one way, while nondepressed people will respond in another way, then the test has face validity.
Concurrent validity means the agreement of the test with other measures of the same trait taken at the
same time. If psychotherapists identify a group of clients who are depressed and a group who aren't
depressed, and test scores agree with these judgments, then the test has concurrent validity. Predictive
validity means that the test is able to predict accurately what will happen in the future. If we construct a
scale of Propensity to Experiment with Drugs, and scores on this test taken at the beginning of the school
year are related to the amount of drug experimentation that occurs throughout the following school year,
then the test has predictive validity.

Construct validity [...] notions. It considers how the
measure of a variable relates to other variables, on some theoretical basis. For example, depression might
be closely related to poor self-concept and lack of hope for the future, but it would not be expected to
be related to intelligence. Assessing how well a measure's association or lack of association with measures
of other constructs adheres to our theoretical notions is at the heart of establishing construct validity.

The assessment of the validity and reliability of tests and other [...]
evaluators will suggest that existing tests on which validity studies have been performed be used, in order to
avoid having to study the validity of a test created especially for a particular evaluation.


The Worth of the Program

So far, a major thrust of this chapter has been on the ways in which outcomes can be specified and
measured. Effective outcome evaluation design and analysis can provide an answer to the question of
whether a program causes effects that differ significantly from those of a control group, or in
comparison to other types of programs. But the question remains—is the program worth the effort?

Worth, or value, is defined at a number of levels and along many dimensions. In a [...]
of human culture, all the social and political forms we participate in, are concerned with the continual
redefinition of worth.



Much of our social and political life concerns the valuing of material things, even as we link these to
more symbolic, ideal, or spiritual concerns. The material resources available to maintain and enhance
human life come in limited quantity. In most circumstances, therefore, we must make continual choices to
use material resources for some purposes, leaving fewer for other purposes. All such choices involve both
material resources and the purposes we want them to fulfill.

In the last quarter century much work has been directed at developing methods for valuing the material
worth of social programs, under the general categories of cost-benefit and cost-effectiveness analysis.
Much of the following discussion about the worth of the program focuses on basic concepts of these
analytic approaches. Remember, however, that any such economic analysis applied to alcohol and drug
abuse prevention is itself worthwhile only in conjunction with other social and political approaches to
valuing. Economic analysis is an extremely fruitful way to look at a prevention program but is not a
substitute for continuing concern—and conflict—about the human values programs are intended to enhance.


When we ask, "Is the program worth the effort?" in economic terms, we are really asking about the
relationship between the value of resources consumed and the value of outcomes produced. When resources
are invested in an activity, we expect that the activity will be effective, producing benefits, and that the
benefits will outweigh the costs. The greater the benefits in relation to the costs, the more economic
worth there is to the activity. In order to measure worth, we must examine the:

o consumption of resources,
o effectiveness of the activity, and
o the relationship between the two.

Consumption of resources.—The costs of a drug program are the values of the resources consumed in its
activity. Costs are most often expressed in units of money. Money, then, is a measure of cost; it is not in
itself a resource. Thus we can talk about the cost in dollars per mile of a vehicle operated for a program, or
dollars per hour of a group facilitator's time as measures of the value of these resources.

To the economist, the cost of a resource is the value of its next best alternative use. If we have $100
and only two choices for disposal—put it in a savings bank at 5 percent interest or buy a
membership in a health club—the economist would claim, and rightfully so, that the true cost of the
membership in the health club over a year's time is $105, the value at the end of the year if we had invested
the $100 in the savings account. In the same way, the cost of a facility for a prevention program equals the
value of what might have been produced by using the same facility for other purposes. This is a subtle
distinction but an important concept. In using resources for prevention programs, we deny their use for
other activities. The cost of the resource is, then, its foregone opportunity—what we lost by not using it for
other purposes—not what we paid for it. However, in a competitive, economically motivated market, the
market value is the true measure of cost. A facility that is rented to the program at the going rate has a
cost equal to the rental fee.

But where a market is not perfectly competitive or does not exist, cost estimation becomes more
complicated. For example, the use of a facility may be donated to a program. The foregone opportunity
cost for the facility might be assigned based on current rental rates. But what if the facility has been
vacant and no one else was interested in using it? Although there are several ways of imputing costs in such
situations, one common approach is to ignore the costs of otherwise unusable resources, on the theory that
"the only free lunch is the one nobody else will eat" (Yates 1980, p. 47).

Costs include more than physical resources and salaried staff. Resources such as volunteers, student
interns, or evaluation consultants contribute to program operation. The values of these resources lie in the
worth of their time. Participants' time also has value. For example, time spent in the program may
prevent participants from getting a job. Thus the opportunity costs of human as well as physical resources
must be considered in calculating total program costs.


Direct and indirect costs.—Another dimension of costs is the distinction between direct and indirect
costs. Direct costs are represented by the use of limited resources for producing services that would not be
produced if the problem did not exist. If alcohol and drug abuse did not exist, there would be no need for
prevention or treatment programs or law enforcement and criminal justice activities directed at the
problems. Instead, these resources could be used for other activities that would enhance the social welfare.

Indirect costs represent the loss to society of what could have been produced if drug abuse did not exist.
Rufener et al. (NIDA 1975) base their estimation of the indirect costs of drug abuse on the foregone earnings
of abusers. This requires the assumption that increased unavailability for employment is causally related to
drug abuse. The unavailability can range from unemployment to work time lost for treatment,
incarceration, or to the ultimate loss, premature death. Society must forego the goods and services that
could otherwise have been produced, had the problem not existed.


Community and operations costs.—The above view of costing is referred to as the community or social
perspective. It includes costs to the program and to various components of the community. While this
perspective is comprehensive, the estimation of many social costs is difficult. This difficulty can be avoided
by taking an operations rather than community perspective. The operations perspective merely looks at
accounting entries in the program's books. This approach does not provide a complete listing of resources
necessary to operate the program in the future and does not consider the foregone opportunity costs of
resources. The operations approach also tends to bias costs in favor of programs that are socially appealing
or that are located in communities that can afford donations of time and other resources.

Let us assume that the operations approach is taken to estimate the costs of a drug prevention program.
The program is located in space donated by the local community, with about one-third of full-time
equivalent staff time consisting of volunteers. The resulting cost estimate could not be used as a gauge to
predict what a similar program would cost in another community, where donated space might not be
available or where volunteers might not be forthcoming. Also, costs could not be compared with other, less
socially acceptable programs in the same community that might not attract as much donated community
involvement.

There is not a clear line between community and operations costs. At a program level, one could
decide to include costs and benefits that are not reflected in accounting ledgers. The key is to keep
costs and benefits at the same level of generalization.

Present valuing.—Resources are consumed over time. If comparing
two resources, one of which will be consumed immediately and the other at some point in the future, then
there must be some way to standardize the values to take the time difference into account. The economist's
approach to this problem is to convert all resources into their present value.



Resources that must be spent immediately have greater value than those resource expenditures that can
be delayed for spending at a future date. Resources that are not spent until a later date can be put to
alternative uses until that time, producing a return. A penny saved is indeed a penny earned. The economist
takes this into account through present valuing. Present valuing allows standardized comparisons between
alternative choices for investments.

Suppose we intend to spend $10,000 each year for three years for a drug prevention program. Assume
that the next best alternative use of this money would produce a 10 percent return. We will use this as the
discount rate. Since the value of a resource is equal to what would be produced by the next best alternative,
the present value of the $10,000 to be spent during the first year is only $9,091, because $9,091 invested
today at 10 percent would produce $10,000 at the end of a year. Using the same procedure, the second
year's expenditures have a present value of $8,264 (which would produce $10,000 at the end of two years if
invested at 10 percent interest), and the third year's expenditures have a present value of only $7,513. Thus
we intend to spend $30,000, but the present value of our future resources is only $24,868.
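The arithmetic of this example can be reproduced with a few lines of code. The sketch below discounts
each year's $10,000 at 10 percent, treating each amount as spent at the end of its year, which is the
convention the figures in the text imply. The function name is an illustrative choice.

```python
def present_value(amount, rate, years):
    """Value today of an amount spent (or received) after the given number of years."""
    return amount / (1 + rate) ** years


DISCOUNT_RATE = 0.10
YEARLY_SPENDING = 10_000

# Present value of each of the three yearly expenditures, rounded to whole dollars.
yearly_values = [round(present_value(YEARLY_SPENDING, DISCOUNT_RATE, year))
                 for year in (1, 2, 3)]
total_present_value = sum(yearly_values)
```

The three rounded values are $9,091, $8,264, and $7,513, which sum to $24,868.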


When we discuss the development of cost-effectiveness and benefit analysis, it will become evident that
the choice of discount rate plays a major role in comparing programs. Different rates produce conflicting
results depending on the time frames of expenditures and benefits. For this reason, many analysts will
report results using two or more discount rates in order to determine the effect of the rates on the findings.
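To see how the choice of discount rate can reverse a comparison, consider the following sketch. Both
programs and all dollar figures are hypothetical: each costs $10,000 today, but one returns its benefit
after one year and the other returns a larger benefit after five years.

```python
def net_present_value(cost_now, benefit, years_until_benefit, rate):
    """Present value of a single future benefit minus a cost paid today."""
    return benefit / (1 + rate) ** years_until_benefit - cost_now


def compare(rate):
    """Return the net present value of each hypothetical program at the given rate."""
    quick_program = net_present_value(10_000, 12_000, 1, rate)  # benefit after 1 year
    slow_program = net_present_value(10_000, 16_000, 5, rate)   # benefit after 5 years
    return quick_program, slow_program


quick_at_5, slow_at_5 = compare(0.05)
quick_at_10, slow_at_10 = compare(0.10)
```

At a 5 percent rate the slower program shows the larger net present value; at 10 percent the ranking
flips and the slower program's net present value turns slightly negative. Reporting both rates makes
this sensitivity visible.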

In summary, several major issues must be carefully considered when developing a cost assessment. Not
all costs are easily expressed in monetary terms. The level of detail in collecting data and reporting costs
must be based on a consideration of how much accuracy is added to the final cost figures. Data must be
available in sufficient detail to allow accuracy in reporting costs for variables that represent the greatest
use of resources, without being unnecessarily specific. Certainly office supplies represent a
cost, for example, but one would not count the number of ball point pens used per month. But knowing the
major costs of a program is only the first step in assigning worth. The second is in knowing the benefits, or
positive effects, produced by the program.

Effectiveness.—When an analysis of costs and outcomes is conducted, the importance of identifying and
testing for all relevant outcomes is brought home forcefully. For instance, although reduced delinquency
might not be an intended program goal, if it occurs as a result of the program it should be considered as part
of the worth of the program.

We have already discussed the major methods for determining if program outcomes are statistically
significant compared to control groups or to other programs with similar goals. In a cost analysis, we must
be able to specify the amount of change due to the program. It is not enough to say, for instance, that the
experimental group had a significantly greater improvement in self-concept than a control group. We must
know how much change can be directly attributed to the program.

Very often, we can obtain outcome data on a level of greater specificity than we can cost data. Most
evaluation designs not only allow, but also require information regarding change at an individual level.
Repeated tests of self-concept, for example, yield measures of change for each participant. It is rarely
worth the effort to gather cost data specific enough to give the exact cost of the changes produced in
individual participants. Therefore, for most analyses, the average change is used. However, this depends
on the type of decision to be made, as we shall discuss later.

Effectiveness can be stated in three major ways. First, one could measure marginal variables, which
compare differences over time for the individual participant, or between prevention approaches. A usual
example would be the change in self-concept score before and after program participation. Typical
evaluation designs use these kinds of comparisons.

Another way is the goal referenced comparison, where effectiveness is measured in terms of how close
the program comes to achieving its stated objectives. The catch in this approach is that quantification of
goal statements is often done intuitively, and only after the evaluation effort can program administrators
adequately state expectations for program performance. To satisfy the needs of funding sources, the
manager may write an objective which says something like "At the conclusion of program activities, illicit
drug use among participants will have decreased by 40 percent." But where did the 40 percent come from?
Is it a reasonable expectation based on prior experience, or is it a number concocted to satisfy the needs of
others to know what they should expect from the program?


If it is reasonable given past experience, then comparing performance to the goal is a good way to
assess effectiveness. But if the goal statement is either overstated or understated, then any comparison of
actual performance to the goal statement has no meaning. This illustrates yet another aspect of the
problem of stating program objectives. Very often the information needed to
state the objective arises only from the evaluation that is supposed to be in part based on the statement.

The final major reference for effectiveness variables is the aggregate level of performance, or the
norm. A program could be judged on the strength of its ability to reach the population norms for its
objectives. A problem arises, however, when the norm is not a measure of what is desired. If a prevention
program is directed at a group of adolescents whose drug use is higher than the norm (as determined, say, by
national surveys), then how satisfied should we be to find out that the program has reduced drug use to the
level of the general adolescent population—a drug use level that we are all concerned about?

The relationship between cost and effectiveness.—Having discussed how to assess costs and
effectiveness, we can now move closer to the issue of worth—the relationship between the two.
Cost-effectiveness is the general expression of the relationship between resources consumed and
outcomes produced. If cost and effectiveness are expressed in the same terms, usually dollars, then the
relationship is referred to as "cost-benefit."

The outcomes of social programs are not simply expressed. The problem is in assigning monetary values
to enhancements in the quality or length of life. What dollar value do we place on an improvement in
self-concept? How do we express in monetary terms the benefits accruing from preventing one person from
becoming a drug abuser? One measure of benefit is earnings—the value of goods that could be produced by
those prevented from becoming abusers. But how, then, can we justify prevention activities directed toward
the elderly, who have no future earning potential? How can this human benefit be expressed in monetary
terms?

One solution to the problem of valuing outcomes that have no market value (or an equivalent) was
developed when economists attempted to evaluate the effectiveness of alternative
systems. Given two systems with the same objective, it was not necessary to convert benefits to monetary
terms. Instead, the one that achieved the desired objective at the lower cost was chosen. The major
weakness of this approach is that comparisons can be made only between programs where the effects can be
expressed in the same exact terms, such as increase in self-concept as measured by the same test.

In prevention we can express in monetary terms such outcomes as reduced treatment or incarceration
costs, increased earnings, and the like. The same issue of present valuing that was discussed relative to
costs applies to benefits. To determine the net value of a program, we must first discount benefits, or
convert monetarily expressed outcomes to present value. Having done so, it is simple to subtract the
present value of costs from the present value of benefits. The result is the present value of net benefits—a
monetary measure of worth. Of course, a negative value indicates that costs exceed benefits.

Another way to express the relationship is by using the ratio of benefits to costs (or vice-versa). The
larger the benefit-to-cost ratio, the greater the worth of the program. A ratio of less than one indicates
that the present value of costs exceeds that of prevention benefits.

A third expression for measuring worth is the internal rate of return, which is equivalent to the interest
the program makes on its investment. This rate is the one that, when applied to the costs, will equalize the
present value of costs and benefits. If the internal rate of return for the program is higher than the
accepted interest rate for social or private investment, then the program is worthwhile.


Here is an example of the three methods. Say we have estimated the present value of costs for a drug
program to be $100,000, with a present value of benefits of $110,000. The difference is $10,000, the
present value of net benefits. This tells something about the program's value, but another program will
achieve the same difference with a cost of $20,000 and a benefit of $30,000. The second program has
invested fewer resources to obtain the same net benefit, but a comparison of the present value of
net benefits does not reveal that fact. More information results from calculating the benefit to cost ratios.
The first program has a ratio of 1.1 ($110,000/$100,000). The second program has a ratio of 1.5—surely a
significant difference.

Finally, calculating the internal rate of return provides even more information. With a 10 percent
internal rate of return for the first program and for the second program a more sizeable 50 percent, not only
can the two programs be compared to each other, but also each can be compared to the accepted investment
interest rate. The results of this third criterion, the internal rate of return, might not be congruent with the
other criteria because of different timing of expenditures of resources and the accrual of benefits.

Results are also completely dependent on the choice of discount rate and on the time periods over which we
discount costs and benefits. As the discount rate increases, the present value of future benefits declines
sharply.
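The three measures in this example can be placed side by side in a short sketch. The dollar figures
are those given in the text. The internal rate of return is computed under a simplifying assumption
that is not stated in the text: all benefits are treated as arriving one period after the costs, so
the rate is simply benefits divided by costs, minus one. With benefits spread over several periods the
rate would have to be found by trial and error or a numerical solver.

```python
def net_benefits(benefits, costs):
    """Present value of net benefits: benefits minus costs."""
    return benefits - costs


def benefit_cost_ratio(benefits, costs):
    """Ratio of the present value of benefits to the present value of costs."""
    return benefits / costs


def one_period_rate_of_return(benefits, costs):
    """Rate r such that costs * (1 + r) equals benefits (single-period assumption)."""
    return benefits / costs - 1


# (present value of costs, present value of benefits) for the two programs in the text.
programs = {"first": (100_000, 110_000), "second": (20_000, 30_000)}

summary = {
    name: (
        net_benefits(benefits, costs),
        benefit_cost_ratio(benefits, costs),
        one_period_rate_of_return(benefits, costs),
    )
    for name, (costs, benefits) in programs.items()
}
```

Both programs show net benefits of $10,000, while the ratios (1.1 and 1.5) and the rates of return
(10 percent and 50 percent) separate them, matching the figures quoted above.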

When making a choice between program approaches which achieve the same objectives, you need not be
concerned with expressing benefits in monetary terms. To compare two approaches for improving
self-concept, you need only accept a common measure and compare program outcomes and costs. In this case
the measure might be increases in scores on the Piers-Harris or some other well known scale. Such a measure
is accepted in the same spirit as money is accepted as a common measure in cost-benefit analysis.

Of course, there are complications. In cost-benefit analysis, assumptions are made about money that
might not apply to scores on a self-concept test. Certainly we would be quick to say that a 10-dollar bill is
worth 10 one-dollar bills. But is a 10-point increase in self-concept by one person worth the same as 1-point
increases by 10 people? Are we willing to accept these two changes in self-concept as equal in value and
deserving of the same? No economic market establishes the two values as equal or unequal.

Average and marginal costs.—Costs can be looked at in two ways in cost-effectiveness comparisons,
based on the question to be answered. If we can continue to support only one of two existing programs, then
the average cost per unit effectiveness is the first choice for a measure. If we wish to increase the capacity
of one program or the other, then the first choice is marginal (additional) costs.

Assume that a program's effectiveness is measured by reduction of marijuana users. Without
calculating the exact cost per participant, we can obtain an average cost per unit of outcome by dividing
total costs by units of outcome. Say that in a given time period the number of users is reduced by 2 percent
for a total program cost of $10,000; the average cost of each percent reduction is $5,000. Compare
this to another, similar program which is able to achieve a 3 percent reduction for $12,000—an average cost
of $4,000 for each percent reduction. If forced to choose between programs, we would choose the latter,
which achieves the same effect for $1,000 less per unit.
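The average-cost comparison amounts to a single division per program, as the sketch below shows. The
costs and percent reductions are the ones used in the example; the function name is an illustrative
choice.

```python
def average_cost_per_unit(total_cost, units_of_outcome):
    """Average cost of producing one unit of outcome."""
    return total_cost / units_of_outcome


# One unit of outcome here is a one-percent reduction in the number of users.
first_program = average_cost_per_unit(10_000, 2)   # 2 percent reduction
second_program = average_cost_per_unit(12_000, 3)  # 3 percent reduction
saving_per_unit = first_program - second_program
```

The second program costs more in total but $1,000 less per percent reduction, which is why it would
be preferred when only one program can continue.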

If, instead of choosing between programs, the question involves increasing or reducing allocations to
competing programs, then an analysis of marginal costs is needed. Marginal costs are those that are
necessary to increase or reduce the effect by one unit. Of two programs, say one involves awareness groups,
the major cost being personnel, and the other is a fine arts club, with a major expense in art supplies.
Assume that the programs have equal total costs and effectiveness. Unless the first program were filled to
capacity and had to hire a new staff person just for the sake of one additional participant, it would probably
be more effective to give the additional funds to this program. Increasing allocations to this program would
give a better return for an equal added investment.



Cost-effectiveness analysis using average costs requires only aggregated data at the program level.
Marginal cost analysis requires some data at the individual level. But these techniques only inform decisions
to support effectiveness, not to improve it.

The Resource-Component Model

Everything discussed so far is defined by Yates (1980) as assessment. He considers analysis as the
process that develops information after considering cost constraints, process characteristics, and
effectiveness criteria. Program decisions are constantly made to shift resources to reduce cost and improve
effectiveness. Yates' model organizes the issues considered by any good administrator.
It starts simply, with the path of resources supplying a process that produces an outcome.
Resources.—The resources of a drug prevention program are the facilities, equipment, materials,
personnel competencies, and participant dysfunctions and competencies. As Yates (p. 94) notes for mental
health, "[...] their potential in the community, are a necessary resource because without
dysfunction the existence of mental health services cannot be justified."

Resource constraints limit every program. The constraints are interrelated across
the available resources in the sense that a change in the limits of any is likely to affect the others. A
program in a small facility cannot expand its staff or clientele beyond the limits of the facility.
Competence of personnel will affect client entry into the system.

Process.—The process components are the technology available to the program and its delivery system,
and there are constraints in both. Staff can always be better trained and better able to apply that training.
The constraints on technology are measured by the best outcome possible. If use of a certain technology
under ideal conditions prevents "only" 95 percent of all drug abuse, then there is a constraint on that
technology. We cannot stop the other five percent from using drugs. The constraints on the delivery system
are measured by the difference between the constraints on the technology and the actual outcomes that the
program is able to achieve.

Outcomes.—The major outcome in prevention is self-evident. In decisionmaking, the manager
considers other possible outcomes as well, both positive and negative. It may be, for instance, that a small
proportion of youth who are taught decisionmaking skills use these skills to reinforce values considered
deviant by society. The possibility should not be ignored, but rather examined. Certainly
knowledge of who might have negative outcomes and under what conditions can be helpful both to avoid the
negative outcomes and improve the technology.

Application of the model.—The competent manager considers all aspects of the system for
decisionmaking. These considerations may be qualitative, or what many would call subjective, because they
are not easily amenable to measurement and have not been externally validated by scientific methods. Careful
cost-effectiveness analysis can help validate decisionmaking as well as improve it through new and relevant
information. At the level of a single program, analysis of the cost and outcomes of specific components in
the context of restraints can provide information to improve program performance by altering activities to:

o produce a specified level of effectiveness with minimal costs,

o maximize effectiveness with a specified level of costs, or

o develop an optimal mix of costs and effects.

In the example program developed in this chapter, the language teacher had a much higher attrition
rate than the guidance counselor. If we knew the [...] and drug-use
scales of each group leader, we might identify differential outcomes related either to (a) different
participant types, or (b) different levels of competency of the group leaders. This could lead to decisions
regarding training or participant assignment.

Client routes of entry into the program might be related to differences in outcome. Of the various
types of selection (self, other students, the two group leaders, or other teachers),
some types might have better outcomes than others. Similarly, some activities within the overall program
might be more effective relative to their cost than others. As the number of variables to be considered
increases, the complexity of decisionmaking increases, and the cost-effectiveness of the analysis itself
becomes an issue. The program manager must decide how much of existing resources should be directed toward
evaluation based on the expected return for the investment.

Careful cost-effectiveness analyses based on accurate evaluations of outcomes can justify the
continued operation of a good prevention program. But remember that all such analyses are based on the
assumption of scarce resources. If resources were unlimited, costs would not have to be justified. In theory,
at least, unlimited resources imply unlimited technologies. All problems could be solved. But in the real
world, many resources are getting scarce, and the need becomes greater to justify the use of resources by
improving the social welfare. It is at this point that the goals of the action researcher and the program
decisionmaker fully merge.






NOTES

¹This Merlin-like approach is ascribed to Reichardt (1981) who purports to have taken it from Rubin (1974).

²There is a result in the drug prevention literature which is chillingly like the present example. It has been
suggested that drug information programs, while perhaps decreasing use in frequent drug users, may well
lead abstainers or infrequent users to use drugs. If you imagine a drug prevention program intervention
between the two test administrations in Figure [...], you will realize how regression artifacts may confound
our interpretation of evaluations. A randomly assigned control group for the high and low users would have
clarified the meaning of the data, as in Figure 1a, in which there was less gain in drug use in program
participants than in randomly assigned controls.



3. This sort of consistency is more clearly grasped in terms of achievement and ability tests. On a test of
mathematical ability, there should not be some items which are easier if your math ability is low. We've
probably all had the experience in which the more you know, the more difficult the question becomes
because more than one alternative can be plausible.
CHAPTER 5: PREPARING FOR THE EVALUATION

(They Say It's the Light Stuff ... But)

It's your program that's being evaluated.

A program decisionmaker should be involved in every stage of the program evaluation (planning,
implementation, and utilization), just as in any other major program activity. Each stage requires a
particular set of skills and a particular involvement. The program manager's role in each stage is
described in this chapter, with emphasis on what to expect, what to do, and what pitfalls to avoid.

The role of the evaluator will also be discussed as it parallels and intersects that of the program
manager. Additionally, the critical roles of staff, boards, and concerned community members or service
recipients will be emphasized.

First, the requisites for planning will be discussed, covering such issues as:

o selecting the evaluator

o preparation of self, staff, and community

o contracting with the evaluator.

The second section will consist of a detailed discussion of the evaluation process (French and Kaufman
1981):

Planning
   step 1: analysis of decisionmaking activities
   step 2: analysis of program activity
   step 3: development of alternative evaluation designs
   step 4: initial selection of a design
   step 5: operationalization of the design

Implementation
   step 6: field test of the plan
   step 7: revisions resulting from field test
   step 8: collection and analysis of data

Utilization
   step 9: utilization of results.

The chapter will emphasize how success at each step depends on satisfactory resolution of previous
steps. Discussion of the ninth step, in particular, will demonstrate the dependence of utilization on all that
has gone before and will also discuss the impact of the politics of an evaluation (internal and external to the
program) on the utilization of the evaluation.
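The dependence of each step on the ones before it can be sketched as a simple ordered checklist. This is an illustrative sketch only: the step names follow the nine-step list in this chapter, the grouping into phases follows the chapter's planning, implementation, and utilization labels, and the function names are hypothetical.

```python
from typing import Optional, Set

# (step number, phase, description), in the order the chapter presents them.
STEPS = [
    (1, "planning", "analysis of decisionmaking activities"),
    (2, "planning", "analysis of program activity"),
    (3, "planning", "development of alternative evaluation designs"),
    (4, "planning", "initial selection of a design"),
    (5, "planning", "operationalization of the design"),
    (6, "implementation", "field test of the plan"),
    (7, "implementation", "revisions resulting from field test"),
    (8, "implementation", "collection and analysis of data"),
    (9, "utilization", "utilization of results"),
]


def next_step(completed: Set[int]) -> Optional[int]:
    """Return the earliest unresolved step, or None when all nine are done.

    Because each step depends on the previous ones, a later step that was
    finished out of order does not let the process skip an earlier gap.
    """
    for number, _phase, _description in STEPS:
        if number not in completed:
            return number
    return None


def phase_of(step_number: int) -> str:
    """Look up the phase a step belongs to."""
    for number, phase, _description in STEPS:
        if number == step_number:
            return phase
    raise ValueError(f"unknown step: {step_number}")


# Step 4 was attempted before step 3 was resolved, so step 3 is still next.
print(next_step({1, 2, 4}))
```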

REQUISITES FOR PLANNING
The following is a true story. A State agency informed a local program director that his program was
scheduled for evaluation during the year. The director was pleased, saying that there were many questions
he would like answered. The State evaluator told the director that the agency wanted its questions
answered, not his: questions pertaining to the success of the local system in adhering to certain statewide
standards.

The director said he was aware of no such standards. The evaluator replied that his team had only
recently written them, and they were still in draft form. The director objected to being held accountable
for draft standards. Not to worry, he was assured; they would become official standards once the legal
office's review was complete and would then be implemented statewide.

When the director objected to being held accountable for draft standards of questionable legality, the
evaluator reminded him that the State contributed two-thirds of his funding. With that, the director
relented and asked for a copy of the standards. The evaluator then told him that, unfortunately, the State
agency director had prohibited distribution pending legal clearance.

Thus, the local director found himself being evaluated by his funding source, on its terms, with its
evaluator, according to unofficial, legally questionable, and secret standards.

The local program staff resented the evaluation, finding the evaluation team obtrusive and
incompetent. Hostile letters were exchanged. The evaluation report, after a delay of several months, was
distributed simultaneously to the local director, several funding agency staff, community representatives,
and elected officials. The report, which the local director had no opportunity to review before distribution,
contained several factual errors and many interpretations subject to dispute.

This anecdote is a textbook case of how not to do an evaluation. The pitfalls evident in the example
can easily be avoided by adhering to the planning requisites and the nine-step process presented below.

Selecting the Evaluator

The motives for an evaluation will have major impact on what is evaluated, ultimate use of the results,
the manager and program's participation, and selection of an evaluator.

Generally, evaluators may come from three sources: the funding agency, the program itself, or an
organization independent from both. The funding source may not only insist upon an evaluation, but may
also provide an evaluator. The program may have an in-house evaluator or it may hire an external
evaluator. Selection of the evaluator may be the prerogative of the program manager, funding agency,
board of directors, or the like, depending on the impetus for the evaluation and who is paying for it.

An important issue is to whom the evaluator is responsible, since the evaluator will give primary
allegiance to that person. Allegiance is a major concern because all steps of the program evaluation will be
influenced by the relationship between evaluator and employer. Everything will be affected, including what
is done, what is inferred, what is said (and not said), and who hears it.

In general, most program managers will prefer not having an evaluator selected for (or forced on) them,
and will prefer to have the evaluator accountable to them. A manager who recruits, selects, and pays the
evaluator will be in a stronger position to monitor the aims, process, interpretation, dissemination, and use
of the evaluation. No matter what direct authority the manager has over the evaluator, several factors
should be considered to assess the evaluator's appropriateness.

Technical competence. By education and experience, does the evaluator have knowledge and
competence to do the job? Can the candidate establish evaluation goals? Develop sound designs? Select
suitable measurement techniques? Analyze and interpret data? Write a coherent sentence that is also
appropriate to the audience?

Versatility. An evaluator with a repertoire of techniques will be more likely to meet the program's
needs. As Patton (1978, p. 31) says, "The burden rests with the evaluator to understand what kind of
evaluation is appropriate for different types of programs rather than forcing all programs into a single
evaluation model."

The obligation of the evaluator is to evaluate a program as it is, unless the manager agrees to program
changes. An evaluator must have the flexibility to conduct credible evaluations without requiring that the
program be changed ahead of time simply to meet design requirements. Put differently, effective evaluation
is partly an art, and there's no reason to believe that Rembrandt painted by the numbers.

Cultural sensitivity. If a program serves a community with a significant number of language or ethnic
minority members, the evaluation will need to address issues relevant to those groups. Different goals and
different assessment techniques may be needed. In addition to technical knowledge of measurement issues
involving those groups, the evaluator must be familiar with the values of the groups involved, exhibit respect
for such cultural diversity as it exists, and be acceptable to the community.

Manager/Evaluator Relationships 

The relationship between managers and evaluators is often strained; Weiss (1977) suggests four sources
of conflict.



Personality differences. The manager and the evaluator are usually different types of people whose
very differences drew them to separate fields where their differences were reinforced by experience.
Evaluators see themselves as scientists contributing to the knowledge base of society; program managers see
themselves as helpers who contribute through service provision. The former are data oriented, the latter
are people oriented. Such differences provide potential for conflict.

Role differences. Evaluation implies judgment; the evaluator is the judge, the
manager the judged. This role hierarchy is heightened when the evaluator is the agent of the funding source
or some other outside, controlling group, and complicated when he is hired by the manager.

Lack of boundary clarity. An evaluator's role can be as limited as the analysis of existing data, or as
broad as helping a program identify its goals, conducting a full-scale outcome evaluation, and then helping
the manager make changes indicated by evaluation results. Because the evaluator's role boundaries are
often left undefined, tensions are probable.

Resentments over differential rewards. Evaluators may receive more pay than program staff and may
be perceived as less hard working: "We do the work; evaluators read charts." Even the appearance of the
evaluator's name on a final report can be a source of friction.

That program managers and evaluators have differing and occasionally incompatible world views is
nowhere better illustrated than in an article by Weiss (1977), who conducted a survey of participants in 10
evaluations of human service programs. Two major differences in perspectives were found, one in the way
participants view evaluations, another in the way the parties view each other.

Both evaluators and managers expressed general uncertainty about the purposes of the evaluations they
participated in: whether the studies were to serve the program, its funders, or knowledge in the field.
Managers saw evaluations, practically if not ideally, as serving three functions:

o a ritual to secure funding
o an opportunity to vindicate the program
o a guide to change and improvement.

In contrast, evaluators had somewhat more idealized views about evaluation functions:

o assessment of program effectiveness to enable decisions to be made
o an opportunity to contribute to basic knowledge.

Further, managers generally preferred evaluations focusing on process and on program development,
whereas evaluators preferred those emphasizing outcome and effectiveness to
facilitate judgment of programs. When evaluations conformed more to the wishes and beliefs of evaluators,
managers tended to lose interest in the evaluations and to withdraw support.

Weiss (1977, pp. 33-34) also suggests a fundamental mistrust of motive and viewpoint between managers
and evaluators. Evaluators are credited with fighting "for the integrity of their data" in the face of
attempts by managers to impose positive interpretations on equivocal findings. Managers are alleged to
grant autonomy to evaluators "less from respect for the integrity of research than from unsophistication
about possible effects of evaluation." Then, as sophistication increases, "there may be more interference
with the planning and conduct of evaluation research." Evaluators see managers as hampering evaluation,
"often out of ignorance."



In another article, Weiss (1975, p. 15) writes that managers "are not irrational; they have a different
model of rationality in mind. They are concerned not just with today's progress in achieving program goals,
but with building long-term support for the program. Accomplishing the goals for which the program was
set up is not unimportant, but it is not the only, the largest, or usually the most immediate of the concerns
on the administrator's docket."




As these quotes suggest, a common view aligns

   evaluators with knowledge and integrity,
   managers with ignorance and resistance,

suggesting not a little condescension.

Trying to compare managers and evaluators along a good-bad dimension is
inappropriate. Far more productive is viewing each party as possessing integrity, ability, and devotion to
certain kinds of truth. Both are dedicated to doing the best possible job, but they have different jobs with
different success criteria. Evaluators believe in and fight for the integrity of their data; managers equally
believe in the integrity of their programs. As a respondent in the Weiss (1977, p. 34) survey said,
"Practitioners have to believe in what they're doing, evaluators have to doubt."



Both managers and evaluators want to be successful in their work, but success is
differently defined, and for neither is career success dependent primarily on the effectiveness of programs.
Evaluators develop careers by conducting methodologically competent evaluations useful in guiding social or
program policy. Whether the programs they evaluate
are successful is not their primary concern. For program managers in human service programs, success is
usually defined in terms of longevity, growth, size of staff and budget, and number of people served.
Because many human service programs are never evaluated, and evaluation reports are
filed and forgotten more often than not, the actual effectiveness of a program may have little impact on a
manager's career and reputation.



Attending to some of these differences and similarities should help managers and evaluators see
themselves not as antagonists, but as complementary and even synergistic partners in the enterprise of
program evaluation.

Preparation of Self, Staff, and Community

Preparation for an evaluation requires focusing on both technical and context issues. The former
involves analyzing the stage of program development, assessing information needs, and determining
readiness for evaluation, issues which will be developed later as part of the nine-step process. The context
refers to the psychological and political readiness of the program: attitudes, beliefs, and interrelationships
of managers, staff, service recipients, and advisory or governing boards.

Typically, evaluations are perceived by staff as threatening. At the least, evaluations will cause some
disruption: there will be interviews, record reviews, and more forms to complete. At the worst, evaluations
cast doubt on program effectiveness and staff competence, threatening the esteem and job security of
program staff. The lives of staff are inevitably affected by an evaluation, to degrees ranging from mild
disruption to distinct threat.

Service recipients, too, may be directly affected by an evaluation process. They may find themselves
being interviewed by strangers, having questionnaires or psychological tests administered to them, and signing
release forms. Further, any disquiet felt by the staff may be passed along to recipients of service.

Finally, parent organizations, such as local health departments, community mental health centers, or
boards of directors, may also be interested in the evaluation and should be involved in the preparation
process.

To create the best possible context for an evaluation, two actions should be taken by the manager.
First, analyze the relative importance of the motives for the evaluation and its potential effect on the
program. It is easy to focus too much on an imposed evaluation or on the temporary disruption of the
program, but the real significance of an evaluation lies with its potential impact. An evaluation report
based on a month of frenzied activity may be filed unread at the State agency; alternatively, an unobtrusive
analysis of file data, conducted with little or no immediate effect on staff or clients, could have a major
effect on the program's future.

As a general rule, the greater the evaluation's potential effect on the program, positive or negative,
the more important it is for the manager to involve staff, consumers, and other concerned parties in the
evaluation process. Effective involvement of these parties, although no panacea, will improve the
evaluation process, create a broader sense of ownership, and make program changes easier to put into
effect.

Others might be also involved, but there's no simple guideline for determining who. It will depend not
only on the evaluation circumstances, but also on a program's organizational size and structure, relationships
with consumer groups, and place (if any) within a larger organization. In addition to the director and the
evaluator, three groups should be considered for involvement in the evaluation planning process: staff,
recipients, and advisory or governing boards.

For any program, key staff must be involved, however that term is defined. It is usually helpful to
include at least one person who has a clinical or provider (as opposed to administrative) orientation.

Consumer or recipient involvement may prove more difficult to obtain. If you have an active consumer
group in your community, the program is a step ahead. A citizens' advisory group may provide an
appropriate representative. The population at which the program is aimed has a legitimate investment not
only in the program but in its evaluation.

Significant cultural or linguistic diversity within the target population will complicate consumer input,
but make it more critical. Differences may exist between cultural groups as to program goals and criteria
for success. For instance, a program aimed at adolescents may have as stated goals reduction of alcohol
use, increase in participation in school activities, and enhanced self concept. While one cultural group may
want no alcohol use by children under 16, another group may tolerate alcohol use at home, and a third may
be more concerned with alcohol-related arrests than with drinking per se. One group may be more
interested in their adolescents having after-school jobs than in whether they write for the school paper or
play in the band. And, certainly, the definition of self-esteem varies among cultures and economic classes.
Accordingly, evaluation goals must reflect the diversity within the target community.

Measurement issues are also affected in pluralistic communities. While the controversy over the
applicability of standardized intelligence tests to minorities is well publicized, measures of personality
and attitude should also be culturally relevant. The number of culturally tested measures is small, and
managers may legitimately expect evaluators to be aware of those that do exist. As a rule of thumb,
translations from English into, say, Danish or Vietnamese do not yield measures of comparable meaning or
validity. Review by representatives of the cultures concerned can help ensure not only adequate goals and
measures, but also acceptance of results.

Finally, depending on organizational circumstances, the evaluation should involve advisory or governing
boards and concerned managers of the larger organizations within which the program may be placed. Before
the evaluation officially begins, three basic questions should be answered: What is being evaluated, how, and
what will be done with the results? Involvement of key staff, consumers, and concerned community or
governing agencies in answering these questions is fundamental to preparing for an evaluation.

Contracting with the Evaluator



The contract with the evaluator need not be a binding legal document, but should express a clear
understanding (preferably written or part of a legal contract) of the responsibilities of the evaluator and the
program, and the boundaries between them. Seven critical and potentially troublesome issues must be
resolved prior to formal implementation of the evaluation.

Division of labor. Who will collect the data, who will distribute forms, who will conduct interviews,
and who will provide necessary training? The answer to any of these questions could be the evaluator, the
program staff, students, volunteers, etc. The worst answer is no answer; these are questions to be
considered in advance.

Division of resources. A related issue has to do with access to resources. Who provides typing,
photocopying, envelopes and stamps, computer time, paper, and the like?

Timetable. Specifying well in advance when steps in the process are to occur, or to be completed, will
help all parties budget their time. Particular attention should be paid to time of delivery of the final
product. Few things can dilute the usefulness of an evaluation more than results delivered too long after
data were gathered. Program people lose interest, funding cycles may be missed, or circumstances may
have changed. It will be more helpful to have a finished evaluation 2 months before rather than 2 weeks
after a budget is due.
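The timing rule can be checked mechanically when a contract is drafted. The following is a minimal sketch under stated assumptions: the dates are invented, "2 months" is approximated as 60 days, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: the "2 months before" guideline is approximated as 60 days.
DEFAULT_LEAD_DAYS = 60


def latest_useful_delivery(budget_due: date, lead_days: int = DEFAULT_LEAD_DAYS) -> date:
    """Latest report date that still leaves the desired lead time before the budget is due."""
    return budget_due - timedelta(days=lead_days)


def delivery_is_timely(report_due: date, budget_due: date,
                       lead_days: int = DEFAULT_LEAD_DAYS) -> bool:
    """True if the final report arrives early enough to inform the budget."""
    return report_due <= latest_useful_delivery(budget_due, lead_days)


# Hypothetical dates for illustration.
budget_due = date(1982, 6, 1)
print(latest_useful_delivery(budget_due))
print(delivery_is_timely(date(1982, 3, 15), budget_due))  # report well ahead of the budget
print(delivery_is_timely(date(1982, 6, 15), budget_due))  # report 2 weeks after the budget
```

Writing the delivery date into the agreement in this explicit form gives both parties the same deadline to plan against.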

Deliverables. What you expect from the evaluator should be stated at the onset. Make it clear if you
want a preliminary report. What kind of final report do you want? How many copies? Will you want some
public presentation or presentation to the staff?

Distribution of results. You'd probably rather learn of the results directly from the evaluator than
from the local newspaper. The final report belongs to the individual or group that provided the impetus for
the evaluation and paid for it. Generally, program managers will want to receive and control access to the
report to whatever extent possible.







Right of preview. Related to the issue of control of the report's distribution is control of its content.
Without invoking debate about the integrity of data,
managers will usually wish to see a preliminary or draft report and have the opportunity to recommend
changes, make corrections, and discuss interpretation. The self-protective stance behind this wish is obvious
enough; at the same time, an evaluator hoping to make a contribution to a program beyond the simple
analysis of data will recognize the risk of Pyrrhic victories inherent in surprise attacks.

Authority to renegotiate. Chances are that things won't go exactly according to plan. Staff won't
cooperate, clients won't show up, computers will malfunction, evaluators will decide to get married, or mail
will get lost. Changes in agreements will be made, and the original negotiation should make specific who
has the authority to approve or to insist upon changes.



THE EVALUATION PROCESS



Planning the Evaluation

Each of the following five planning steps is a prerequisite to conducting an evaluation. The activities
comprising some of these planning steps may be familiar to program managers, and most will have highly
developed skills in these areas. Nevertheless, even these are worth describing in some detail,
especially highlighting the ways they fit into the overall evaluation process.



Step 1: Analysis of decisionmaking activities. An evaluation is useful to the manager because it
produces information for decisionmaking. The evaluator will suggest methods for gathering valid
information, but the program manager is responsible for ensuring that information gathering is guided by
the correct questions: questions whose answers may be used to improve program efficiency, decrease
program costs, increase program effectiveness, or plan for the program's future. These questions will
provide the overall conceptual framework of the evaluation, and their content, scope, and focus will
influence each step in the evaluation planning process. As Patton (1978) has noted, evaluation reports
placed on the manager's bookshelf and never used are almost invariably based on questions not relevant to
the manager's decisionmaking activities. From this perspective, it is difficult to spend too much time in the
analysis of program decisionmaking and the development of evaluation questions.

To develop questions that provide a useful framework for the evaluation, the manager must consider
both short-term and long-term decisions and the information needed to make them. Put another way, the
manager and other relevant decisionmakers (funders, staff) should develop a list of statements which follow
the form:

WE NEED TO KNOW ________ BECAUSE WE NEED TO DECIDE ________.



For example, the manager of a program emphasizing community planning groups might make the statement: 

WE NEED TO KNOW which alternatives programs are most appealing to area youth BECAUSE WE
NEED TO DECIDE directions the planning groups should take.



Similarly, the manager in a multiprogram agency may make the statement: 



WE NEED TO KNOW which of our programs are most cost effective BECAUSE WE NEED TO DECIDE 
where to plan expansion. 
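A manager who collects many such statements may find it convenient to keep them in a uniform structure. The sketch below is one possible way to do so, not a prescribed method: the class name, field names, and the `source` labels are hypothetical, while the two example statements are the ones given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationQuestion:
    source: str          # who raised the question (hypothetical label)
    need_to_know: str    # the information wanted
    need_to_decide: str  # the decision that information will inform

    def statement(self) -> str:
        """Render the question in the we-need-to-know-because-we-need-to-decide form."""
        return (f"WE NEED TO KNOW {self.need_to_know} "
                f"BECAUSE WE NEED TO DECIDE {self.need_to_decide}.")


def incomplete(questions):
    """Return questions missing either half; such items cannot yet guide an evaluation."""
    return [q for q in questions
            if not q.need_to_know.strip() or not q.need_to_decide.strip()]


questions = [
    EvaluationQuestion(
        source="program manager",
        need_to_know="which alternatives programs are most appealing to area youth",
        need_to_decide="directions the planning groups should take",
    ),
    EvaluationQuestion(
        source="multiprogram agency manager",
        need_to_know="which of our programs are most cost effective",
        need_to_decide="where to plan expansion",
    ),
]

for q in questions:
    print(q.statement())
```

Requiring both halves of each statement keeps the list tied to actual decisions, which is the point of the format.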



The development of the we-need-to-know-because-we-need-to-decide list (which is, in fact, the first draft
set of evaluation questions) involves three separate activities:

o analysis of the stage of program development

o assessment of information needs and development of evaluation questions

o assessment of the program's readiness for evaluation and change.
Analysis of the stage of program development. Program development is a dynamic process which can
be roughly divided as follows:

o needs assessment

o policy development

o program design

o program initiation

o program operation.



It is incorrect to view program development as a linear process, with each phase completed before the
next is begun. Rather, a program may be in different phases simultaneously, and all program elements may
not develop at similar rates or at the same time. The manager will ask different questions depending on the
stage of development of the program (or of its various elements). Accordingly, the first step in the analysis
of decisionmaking activity is to determine the stage of program development of those program elements the
evaluation will address.


A major task for the manager in analyzing program development stages is to divide elements of the
program into those relatively stable, and those that are evolving. All too often evaluations address
outcome questions (Is this program element changing drug use?) about program elements that are not
stable in either concept or implementation. An evolving program element is much more likely to fail the
test of outcome evaluation, and a potentially potent program element may thus be unnecessarily eliminated
from further consideration. Because the evaluator may view the program at only one cross section
in time, he will have difficulty assessing the relative stability of various program elements. The manager,
with in-depth knowledge of the program's history, is in the best position to determine which program
elements are stable and which are not.



Tharp and Gallimore (1979) describe the conditions necessary for a social program to reach stability.
Their discussion suggests several criteria. The first is longevity. The history of prevention
programing reveals numerous false starts and blind alleys. As a rule of thumb, a program element requires
at least 6 months to a year before it can begin to stabilize, and some program strategies (community
organization and social policy change) may require several years before stability is reached.



The second criterion is stability of values and goals. Prevention programs and program elements seek
to remediate specific drug and alcohol abuse problems or their precursors. Accordingly, program elements
will be stable only if they address stable problems in ways consistent with stable community
values. The manager's needs assessment data and feel for the climate of values in the community will prove
particularly useful in applying the criterion of goal and value stability.



The third criterion is stability of funding. When different program elements are funded by different
sources or on different funding cycles (often the case for prevention programs), a review by the manager of
funding stability will be most useful in developing questions and focus for the evaluation.
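The three criteria can serve as a rough screening checklist before outcome questions are posed about a program element. The sketch below is illustrative only: the 6-month threshold comes from the rule of thumb above, but the class, its fields, and the example elements are hypothetical, and the judgments about goals and funding remain the manager's to make.

```python
from dataclasses import dataclass

# Lower bound from the rule of thumb; some strategies may need several years.
MIN_MONTHS_IN_OPERATION = 6


@dataclass
class ProgramElement:
    name: str
    months_in_operation: int
    goals_stable: bool    # manager's judgment about stability of values and goals
    funding_stable: bool  # manager's judgment about stability of funding


def stability_concerns(element: ProgramElement) -> list:
    """List which of the three criteria the element does not yet satisfy."""
    concerns = []
    if element.months_in_operation < MIN_MONTHS_IN_OPERATION:
        concerns.append("longevity")
    if not element.goals_stable:
        concerns.append("values and goals")
    if not element.funding_stable:
        concerns.append("funding")
    return concerns


def ready_for_outcome_evaluation(element: ProgramElement) -> bool:
    """An element with no outstanding concerns is a candidate for outcome questions."""
    return not stability_concerns(element)


# Hypothetical program elements.
curriculum = ProgramElement("drug curriculum module", 18, True, True)
planning_group = ProgramElement("community planning group", 4, True, False)

print(ready_for_outcome_evaluation(curriculum))
print(stability_concerns(planning_group))
```

An element flagged here is not a failure; it may simply be better served by process-oriented questions until it stabilizes.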


Once the manager has considered the relative stability of the program or program elements, it will be
important to examine the stage of development of staff responsible for implementing them. Because
of high staff turnover rates in many prevention programs, a well-established program element (such as a
drug curriculum module) is often implemented by a new or relatively new staff member. When this is the
case, the manager may wish to postpone outcome-oriented evaluation until the staff member has had time
to fully learn the new role. Sometimes the competency with which staff implement various program
elements is itself a focus of the evaluation. Even when this is the case, a review of which staff members
are doing what tasks will help the manager develop questions for the evaluation.



A final major issue for the manager to consider in analyzing the stage of program development is the
extent to which various program elements have linkages to, and support from, the community. In their
Design for Youth Development Policy, Bird et al. (1978, p. 142) note that a given program "...acts
simultaneously as a subsystem charged with handling one or more of the problems on a broader scale for the
community and the societal system of which it is part." Prevention professionals recognize this issue, and
program managers have actively sought to use their community linkages to improve the quality and impact
of their program elements. However, the development of such sharing of resources may take considerable
time and effort, especially in larger communities where numerous agencies compete for the resources
needed for prevention. To the extent that the program is a credible member of a community network, the
manager can expect more stability in, and effect from, a given program element. Moreover, if a program is
particularly well linked to other community agencies, the potential for studying community-wide impact is
enhanced.
Having completed an assessment of program linkages, the manager will have a good feeling for the
stage of development of the various program elements. The analysis of the stage of program development
will prove particularly useful in choosing the appropriate level of evaluation (process, outcome, or impact)
and in choosing among various methodologies (qualitative and quantitative). This analysis provides a
background against which the manager may begin to consider information needs and to develop evaluation
questions.



Assessment of information needs and development of evaluation questions. Starting from analysis of
the program's stage of development and using the guidelines set forth elsewhere in this volume, the manager
may now begin to assess the program's unique information needs, guided by the short and long term
decisions faced. This will ensure that the evaluation addresses issues relevant to the manager's role as
decisionmaker. However, the manager is not the only decisionmaker needing information from the
evaluation. Funders, staff, community members, and even program participants and their families have valid
needs for program information. The wise manager identifies individuals who face decisions or need
questions answered about the program.

Patton (1978) suggests that people whose information needs should be considered include people:

o who can use information

o to whom information makes a difference

o with questions they want to have answered

o who care about and are willing to share responsibility for the evaluation and its utilization.

As Patton notes, this list boils down to those who come to mind when thoughtfully considering Marvin Alkin's
(1975) question:

     "Evaluation: Who needs it? Who cares?"

Once the manager has developed a list of relevant decisionmakers and information users, a set of
evaluation questions should be solicited from them. This may not be an easy task, especially if program
staff or participants, for example, are not used to thinking about evaluation. A
useful technique for soliciting evaluation questions is to ask these individuals to develop a list of
we-need-to-know-because-we-need-to-decide statements like the ones described earlier.

Such statements can be obtained in a number of ways, ranging from formal focus groups to informal
meetings and telephone calls or mailed questionnaires. The method will depend in part on the personal style
of the manager and in part on situational constraints. For example, individuals may be geographically
scattered or simply too busy to attend a formal session. The manager may also wish to alter the
we-need-to-know-because-we-need-to-decide format. Patton's (1978) original example used an
I-would-like-to-know-about-this-program format, and the manager will surely think of other useful formats
as well. The particular format is not nearly as important as its ability to elicit important evaluation
questions.

Usually, the information users and decisionmakers will raise
similar issues of program effectiveness, efficiency, and cost. As a side benefit, the manager often gains
new insights into the concerns of staff, board, funders, participants, or community. For many managers,
these insights alone are worth the effort to gather these statements. The program manager should combine
the suggested evaluation questions into a single, unduplicated list. If these individuals are brought together
in a formal meeting, a number of techniques exist for developing a group consensus, for example, the
Nominal Group Technique (Delbecq et al. 1975). However, consensus concerning the list of evaluation
questions is not necessary or even always desirable. The finished product forms a first draft of the
evaluation questions, which the evaluator will later devise methods and measures to answer.
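For a manager who keeps the solicited statements in electronic form, combining them into a single, unduplicated list can be sketched as follows. This is only an illustration: the function names, the group labels, and the sample questions are hypothetical, and the matching rule (ignore case, punctuation, and extra spaces) is an assumption that catches only near-identical wording. Questions that overlap in substance but differ in wording still need to be merged by hand.

```python
import re


def normalize(question: str) -> str:
    """Lowercase, drop punctuation, and collapse spaces so near-identical wording compares equal."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", question.lower())
    return " ".join(cleaned.split())


def combine_questions(submissions: dict[str, list[str]]) -> list[dict]:
    """Merge the questions from every group into one list, recording which groups asked each one."""
    combined: dict[str, dict] = {}
    for group, questions in submissions.items():
        for question in questions:
            key = normalize(question)
            # Keep the first wording seen; later duplicates only add to "asked_by".
            entry = combined.setdefault(key, {"question": question.strip(), "asked_by": []})
            if group not in entry["asked_by"]:
                entry["asked_by"].append(group)
    return list(combined.values())


# Hypothetical statements gathered from two groups.
example = {
    "staff": ["How many youth attend each session?"],
    "funders": ["how many youth attend each session", "What does the program cost per participant?"],
}

for item in combine_questions(example):
    print(item["question"], "| asked by:", ", ".join(item["asked_by"]))
```

Recording who asked each question preserves the side benefit noted above: the combined list still shows which concerns belong to staff, funders, or other groups.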

Assessment of the program's readiness for evaluation and change.--Once a first draft of evaluation
questions has been developed, the manager's analysis of decisionmaking activities is almost complete.
However, before proceeding to the practical issues involved in analyzing program activity (the next
step in evaluation planning), the manager should assess the climate for evaluation and change
within the organization, and especially among program staff.



It will not surprise anyone that a large literature (Delbecq 1974; Lippitt et al. 1958; Hage and Aiken
1970) suggests that individuals and organizations resist change. As program managers are well aware
(Kiresuk et al. 1981, p. 221),

     "one of the most pervasive barriers to change is
     a generic fear of change in general,
     a desire to maintain the status quo."

Evaluation, by its very nature, portends change in the status quo. But there are
other reasons why program staff and others within the organization may resist evaluation. As most
managers know, from a purely practical perspective, the evaluation means more work. Prevention programs
are often understaffed and underfunded. It is the rare program that has staff with time reserved for
evaluation activities. The evaluation may be viewed as an added burden with no apparent benefit to those
taking on the additional work.

Staff may also feel that the evaluator's tools are incapable of measuring what staff are really doing.
This concern may be general, such as a belief that program activities cannot be adequately portrayed through
scientific inquiry, or it may be quite specific, e.g., the appropriateness of a given set of measures for the
program's participants. Staff who have had bad past experiences with evaluators will have little inclination
to repeat the experience. Finally, staff may feel that they, rather than the program, are being evaluated.

Overall, the manager may be faced with a staff who would just as soon forget the whole idea of
evaluation, and who may even attempt to undermine one that is forced on them. Within such a climate, an
evaluation is at best difficult, and at worst a waste of everyone's time and effort.

Fortunately, the manager can use two strategies to encourage acceptance of, and even enthusiasm for, the
evaluation.

The first, already suggested, is involving staff in the development of the evaluation questions. This
strategy helps build ownership of the evaluation and provides tangible benefits from cooperating: the staff's
information needs will be addressed, and they will be "working for their own benefit." Moreover, involving
program staff in the development of questions and other decisions gives the evaluation a level of credibility
well above those evaluations seen as belonging to someone else and addressing someone else's concerns.

The second strategy to decrease resistance is to show staff ways in which evaluation can facilitate,
rather than impede, their daily activities. Evaluations, especially those related to process, can provide
program staff with much needed monitoring information and short-term feedback. For example, one staff
member of an alternatives program confessed that he was often at a loss to remember important specifics
of planning meetings with program participants. A semi-structured log for these meetings both met the
staff member's immediate need and formed an important part of the program's process evaluation. As part
of the design of a process or outcome evaluation, the evaluator can help staff to redesign, streamline,
routinize, and even computerize recordkeeping to decrease the amount of time these activities take. Once
staff become aware of the ways in which evaluation can aid them in improving the day-to-day operation of
the program, they will become avid supporters of the evaluation.

With the completion of step 1 (analysis of decisionmaking activities), the program manager will have
developed the conceptual framework for the evaluation, including a fair idea of the questions to be
addressed. There will be some notion of the appropriate levels of evaluation for each program element and
the beginning of an organizational climate to foster implementation of the evaluation.

Step 2--Analysis of program activity.--Before beginning to design the actual evaluation with the
assistance of an evaluator, the manager must examine certain aspects of the program to determine their
adequacy for the requirements of the evaluation. Specifically, the manager will need to:

o assess the adequacy of program objectives,

o review and catalog current data collection methods, and

o review staff and other resources for evaluation.

Depending upon level of skill and experience with evaluation, the manager may wish to enlist the help of an
evaluator in completing some or all of these activities.

Assess the adequacy of program objectives.--In almost all cases, the manager and others will want the
evaluation to examine program effectiveness. From the evaluator's perspective, this question is always
asked in terms of the program's outcome objectives. While most program managers have extensive
experience in writing objectives that are useful for planning and management, a significant number seem to
have difficulty writing objectives useful for evaluation.

Cantor et al. (1981) propose four useful steps that program managers can use to develop evaluable
outcome objectives. The first step calls for listing program goals. Program objectives are often developed
that are only tangentially related to program goals. Specifying goals will help in developing the objectives.
Well-stated goals are outcome oriented. They specify the condition(s) the program hopes to address and the
target population the program is expected to affect. Because goals are so broad in scope (e.g., reduction of
marijuana use among middle-school students in Lake City), most prevention programs will have only one or
two goals.

The second step requires the development of indicators of goal attainment. Cantor et al. (1981, p. 4)
define indicators as "specific, observable changes in attitude, knowledge, or behavior which are linked
either by theory or logic to goal attainment." Examples of goal attainment indicators for reduction of
marijuana smoking might include improved ability to resist peer pressure, increased knowledge of
alternative highs, or increased ability to cope with stress. Program staff and even program participants (or
potential participants) may be involved in brainstorming indicators of goal attainment.

The third step is the selection of the three or four best indicators of goal attainment. Examples of
criteria to select indicators include the significance and relevance of the indicator for the program's target
population, the importance of the indicator to program decisionmakers, the ease with which the indicator
can be measured, and the ability of the program to have an impact on the indicator.

The final step in Cantor's process is the translation of indicators into measurable objectives.
Measurable objectives include a statement of the indicator, the identification of a target population, a time
frame, and the amount of change expected. Thus, measurable objectives take the form,

     "By ..., students at ... Middle School will report a 20 percent increase in their
     participation in alternatives activities," or

     "By January 11, 1982, 70 percent of the seventh graders will report an increased ability to cope
     without drugs."

Note that these objectives are stated as program outcomes or performance. There is
a temptation to write program objectives which relate to activities rather than outcomes. For example,
"teacher training will be given in five schools during the spring semester." Such process objectives are
useful for program management, but they are of limited value for evaluating program effectiveness.
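The four parts of a measurable objective can be treated as a simple checklist. The sketch below is one hypothetical way to record an objective and flag which parts are missing; the class name, field names, and sample entries are assumptions made for illustration and are not part of Cantor's process.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Objective:
    """One program objective, broken into the four parts a measurable objective needs."""
    indicator: Optional[str] = None          # the observable change being tracked
    target_population: Optional[str] = None  # who is expected to change
    deadline: Optional[date] = None          # the time frame
    expected_change: Optional[str] = None    # the amount of change expected

    def missing_parts(self) -> list[str]:
        """Return the names of any parts that have not been filled in."""
        parts = {
            "indicator": self.indicator,
            "target population": self.target_population,
            "time frame": self.deadline,
            "amount of change": self.expected_change,
        }
        return [name for name, value in parts.items() if not value]

    def is_measurable(self) -> bool:
        return not self.missing_parts()


# The second sample objective from the text, entered part by part.
coping = Objective(
    indicator="self-reported ability to cope without drugs",
    target_population="seventh graders",
    deadline=date(1982, 1, 11),
    expected_change="70 percent report an increase",
)

# A process objective names an activity and a rough time frame, but no outcome.
teacher_training = Objective(deadline=date(1982, 6, 30))  # hypothetical end of spring semester

print(coping.is_measurable())            # True
print(teacher_training.missing_parts())  # indicator, target population, amount of change
```

A process objective fails this check not because it is badly written but because it describes what the program does rather than what changes for participants.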

Review and catalog current data collection methods.--Prevention programs vary widely in the amount
and quality of the records they keep. In some cases, all the data collection necessary for the evaluation will
already be in place. In general, however, new data collection methods will need to be developed. In any
event, the evaluator will wish to know exactly what records are currently kept, and will want an
assessment of the quality of these records.



Basically, four categories of data are regularly required for prevention program evaluation:
participant, staff, program activity, and program cost. Not all these categories will be required for any
given prevention evaluation. The manager can begin to get a good idea of which data will be required by
referring to the analysis of decisionmaking from step 1. Working from the draft list of evaluation questions,
a Data Needs Checklist can be developed. For example, if one evaluation question refers to community
reaction to the program, the Data Needs Checklist will indicate a need for some kind of community attitude
survey. Even the skilled evaluator sometimes finds that not all the necessary data has been gathered to
answer the complete list of evaluation questions.

With the Data Needs Checklist in hand, a manager may begin to consider the data and records currently
available. Client intake and exit interviews, school records, needs assessments, client records, and
telephone logs are obvious sources. However, the manager may find that staff and even clients are keeping
records such as logs and diaries that may be useful for the evaluation. Even if many of these records need
reformatting for the purposes of the evaluation, data collection currently going on will facilitate the
integration of the evaluation into the day-to-day operation of the program.
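A Data Needs Checklist and its comparison with current records can be kept on paper, but the logic is simple enough to sketch in a few lines. In the sketch below, the function names, the evaluation questions, and the record names are all hypothetical; the manager would substitute the program's own draft questions and an inventory of the records actually kept.

```python
def build_checklist(question_needs: dict[str, list[str]]) -> dict[str, list[str]]:
    """Turn 'question -> data needed' into 'data item -> questions that depend on it'."""
    checklist: dict[str, list[str]] = {}
    for question, needs in question_needs.items():
        for item in needs:
            checklist.setdefault(item, []).append(question)
    return checklist


def find_gaps(checklist: dict[str, list[str]], current_records: set[str]) -> list[str]:
    """List the data items on the checklist that no current record supplies."""
    return sorted(item for item in checklist if item not in current_records)


# Hypothetical draft evaluation questions and the data each one requires.
question_needs = {
    "How does the community react to the program?": ["community attitude survey"],
    "Who is the program reaching?": ["client intake interviews", "school records"],
    "What does a program session cost?": ["program cost records", "session logs"],
}

# Hypothetical inventory of records the program already keeps.
current_records = {"client intake interviews", "school records", "session logs", "telephone logs"}

checklist = build_checklist(question_needs)
print(find_gaps(checklist, current_records))
```

The items returned are the ones for which new data collection methods would have to be developed or borrowed.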

The evaluator will want to know about the quality of these data. Simply speaking, the quality of
records depends on three characteristics: regularity, consistency or reliability, and validity.

Regularity refers to the extent that the records are kept up-to-date. While busy staff may sometimes
neglect paperwork without many negative programmatic consequences, missing data can be a disaster for
the evaluation. Accordingly, quality records are kept religiously.



Consistency or reliability refers to the extent to which the same event is recorded in the same way
time after time. If, for example, classroom acting-out is recorded, each similar instance of acting-out
should be recorded in the same way. This requires good definitions of the events to be recorded, and it
requires that all recordkeepers work from the same set of definitions. Even such simple definitions as what
constitutes a program session may vary widely from individual to individual. Consistency of definitions
cannot be assumed.
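One way to check consistency rather than assume it is to have two recordkeepers record the same set of events independently and compare the results. The sketch below computes simple percent agreement and Cohen's kappa, a widely used statistic that corrects agreement for chance. Neither statistic is prescribed by this handbook; they are offered as an assumption about how such a check might be done, and the sample codes are hypothetical.

```python
from collections import Counter


def percent_agreement(a: list[str], b: list[str]) -> float:
    """Share of events that two recordkeepers coded identically."""
    if len(a) != len(b) or not a:
        raise ValueError("both recordkeepers must code the same, non-empty set of events")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both pick the same category independently.
    expected = sum(counts_a[c] * counts_b[c] for c in set(a) | set(b)) / (n * n)
    if expected == 1:
        return 1.0  # both used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Two hypothetical observers coding the same four classroom events.
observer_1 = ["acting-out", "none", "acting-out", "none"]
observer_2 = ["acting-out", "none", "none", "none"]

print(percent_agreement(observer_1, observer_2))  # 0.75
print(cohens_kappa(observer_1, observer_2))       # 0.5
```

Low agreement usually points back to the definitions: the recordkeepers are not working from the same understanding of what counts as the event.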

Finally, validity refers to the extent that the descriptions in the records accurately reflect what
actually happens in the world. For any number of good reasons, responsible individuals put things into
records that simply are not true. Often people do not die from the causes listed on their death certificates
or are not charged with the crimes they actually commit; participant drug use may be overreported or
underreported. The manager must be concerned that those records used for evaluation purposes are valid
reflections of the program.

The manager will more than likely discover that other data collection devices will be needed for the
evaluation. Although the evaluator will be able to suggest a number of instruments, observational
checklists, and so on, the program manager may also wish to begin searching for additional data collection
devices. Readily available sources of instrument descriptions include:

o The appendix to the Handbook of Prevention Evaluation (French and Kaufman 1981)

o The Prevention Evaluation Research Monographs, Outcome Volume (Aiken 1981)

o The Drug Abuse Instrument Handbook (NIDA 1977).

Review staff and other resources for evaluation.--The availability of persons with various skills (and
with free time) will probably be the single greatest constraint on the extensiveness of the evaluation. A
discussion of available resources with the evaluator will be an important first step in developing evaluation
design options.

Basically, all evaluations require individuals to collect, code, and analyze data. All these individuals
(with the possible exception of data analysts) can probably be found within the ranks of program staff. A
brief description of the tasks that must be performed follows and will allow the manager to begin
considering which staff might do what.



Data collectors fall into three basic categories: interviewers, questionnaire administrators, and trained
observers. Of these, questionnaire administrators require the least training, while interviewers and
observers require a more formal introduction to their roles. In no case, however, is academic
preparation directly relevant. It is more important that these individuals be comfortable around and enjoy
people. Usually interviewers and observers can be trained in a 1-day session, although a complex interview
or observational protocol may require a somewhat longer session. Questionnaire administrators may also
require a small amount of training to insure consistency of instruction giving and interpretation of items,
but this training should rarely take more than a few hours. In general, the qualities found in most prevention
program staff (concern for and sensitivity to others, some clinical insight, good communication skills) will
make them excellent data collectors once properly trained.

Data coders are responsible for the coding of questionnaires, interviews, and
observational protocols. Their task may be as simple as transferring numbered responses to code sheets or
as difficult as deciding whether an interview response fits into one or another category. In general, the
work of the data coder is not difficult and almost everyone can help out in this role. Data coders must,
however, be able to do detailed work accurately. The quality of data coding will have a direct impact on the
overall quality of the evaluation.

Data analysts take the raw data and prepare summary statistics, charts, tables, and graphs. Depending
on the evaluation design, they may also perform statistical tests of evaluation hypotheses that range from
relatively simple to highly complex. Ordinarily, graduate training in the social sciences or statistics is
necessary for any but the most rudimentary statistical analysis. How-to books on the statistical analysis of
data do exist (Fitz-Gibbon and Morris' How to Calculate Statistics is one good example), but these are of
limited use. Unless the manager or staff have training in data analysis, other resources for this aspect of
the evaluation should be sought.

Besides person power, the manager will need to find some resource for computing. Unless the
evaluation is completely qualitative (which is rare), or only a small quantity of data is collected, even the
simplest data analyses become overwhelming without the aid of a computer. Some agencies will have access
to computers through a school system or local government, and a lucky few may even have their own
computing resources. However, the manager will often have to look elsewhere for a computer.

Happily, most prevention programs are close enough to a college or university to share in the wealth of
knowledge and resources these institutions offer. Most universities offer
packages of programs for statistical analysis. Moreover, many professors are more than happy to have
"real" data for students to analyze. A call to the chair of psychology, sociology, health education, industrial
engineering, social work, or statistics can sometimes lead to an arrangement for analyzing data. But be
sure your data needs get met, not just theirs.

The university as a resource is by no means limited to data analysis. University students can also serve
as interviewers, interviewer trainers, data coders, and observers.
Most social work programs and many social science programs encourage or require their students to gain
field experience. An offer by the manager of an opportunity for such experience may be welcomed by the
dean or other faculty, but persuasion and negotiation will be necessary.

Step 3--Development of alternative evaluation designs.--The manager is now well prepared to develop
evaluation design options. Here the services of a skilled evaluator will probably be necessary. Before
arriving on site, the evaluator will want to review as much material concerning the program as possible.
The analysis of decisionmaking activities and of program activities will have generated a number of
documents: draft evaluation questions, revised program outcome objectives, a Data Needs Checklist, and
copies of current data collection devices. Copies of these documents, along with proposals,
brochures, program work plans, and the like should be forwarded to the evaluator well in advance of the
consultation visit.

The development of evaluation design options involves two activities:

o deciding on the scope of the evaluation, and

o developing the design options themselves.

In general, the evaluator will take the lead role in both of these activities. However, the manager will need
to remain an active participant to provide the evaluator with the information and data needed, as well as to
make necessary decisions.

Deciding on the scope of the evaluation.--The scope of the evaluation refers to the
amount of data collected and the elaborateness of the evaluation design. From the program manager's
perspective, scope will translate roughly into the number of evaluation questions that can be addressed and
the certainty of the answers produced. There is a tradeoff between the number of questions and the
certainty of the answers. The manager will need to consider the uses of the evaluation information to
balance these two factors.

The evaluator will take several factors into account in helping the manager determine the scope of the
evaluation. These include the program's readiness for evaluation, its current data collection
methods, and its resources for evaluation. After reviewing the program's materials, the evaluator will be
able to give a rough assessment, such as, "We should be able to do a thorough job on the process questions,
but we'll be somewhat limited in our ability to measure effectiveness for all program components." Taking
off from this rough assessment, the evaluator will then specify exactly which evaluation questions on the
draft list are to be included, and which postponed or dropped.



Almost invariably, the draft list of evaluation questions developed by the manager will exceed the scope
possible for the agency. Accordingly, the manager and the evaluator need to prune the list. As Patton
(1978, p. 137) notes, the usual solution to this problem is to rank the goals of the evaluation in terms of their
importance. Patton further notes, however, that priorities set in terms of importance may not result in the
most efficient use of limited evaluation resources (emphasis in original):



     The fact that a goal is ranked first in importance does not necessarily mean that
     decisionmakers and information users need information about attainment of that goal more
     than they need information about a less important goal. In a utilization-focused approach to
     evaluation, program goals are also prioritized by applying the criterion of usefulness of
     evaluative information. . . The ranking of goals by the importance criterion is often quite
     different from the ranking of goals by the usefulness of evaluative information criterion.



A key reason that importance and usefulness yield different priorities is that the most important
prevention program outcomes are often the most distant and difficult to measure. So, for example, the
most important outcome of a smoking prevention program may be a decrease in the prevalence of chronic
disease. However, this outcome may be impossible to measure. Measuring a less important, intermediate
outcome (e.g., being able to refuse a cigarette in a socially acceptable manner) may be more useful to
evaluate and improve the program.

Another reason is that the manager may be able to obtain high-quality information without using
sophisticated evaluation methods. For example, a sophisticated sociological study of classroom
climate is unnecessary if the manager can get all the needed information by visiting classrooms and speaking
with teachers. This is not to suggest, of course, that such a study may not be necessary under other
circumstances for other programs.
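The difference between the two criteria is easy to see when both rankings are written down side by side. The sketch below uses the smoking prevention example; the third goal and all of the rank numbers are hypothetical values chosen for illustration (1 is the highest rank), not rankings taken from Patton.

```python
# Hypothetical rankings for a smoking prevention program (1 = highest).
goals = [
    {"goal": "decrease prevalence of chronic disease", "importance": 1, "usefulness": 3},
    {"goal": "refuse a cigarette in a socially acceptable manner", "importance": 2, "usefulness": 1},
    {"goal": "increase knowledge of health risks", "importance": 3, "usefulness": 2},
]


def rank_by(goal_list: list[dict], criterion: str) -> list[str]:
    """Return goal names ordered from highest to lowest rank on the given criterion."""
    return [g["goal"] for g in sorted(goal_list, key=lambda g: g[criterion])]


by_importance = rank_by(goals, "importance")
by_usefulness = rank_by(goals, "usefulness")

print("By importance:", by_importance)
print("By usefulness:", by_usefulness)
```

When the draft list must be pruned, goals near the top of the usefulness ordering are the ones whose measurement is most likely to inform a pending decision.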



Working together, the evaluator and the manager will refine the draft list of evaluation questions to
bring the most useful areas of evaluative inquiry into focus. Several different lists may be developed and
measured against the scope of the evaluation that the evaluator deems feasible. In the ideal case, the
information users and decisionmakers who helped develop the draft list will be involved to some degree in
this process as well. Minimally, however, the final list of evaluation questions should be reviewed by these
individuals before the actual implementation of the evaluation.






Development of design options.--When it is time to develop evaluation design options, the evaluator
may wish to work offsite, closer to resources such as a personal library and colleagues. While the manager
may view this as a loss of control over the evaluation planning process, it can reasonably be assumed that
input to this point and the refined list of questions will guide the evaluator in appropriate directions. In any
event, the manager will have an opportunity to review the evaluator's design recommendations and assess
their adequacy in meeting information needs.




Chapter 4 has described in detail the issues the evaluator faces in designing an evaluation. Here let us
briefly review these issues in the context of developing evaluation design options. Basically, the evaluator
will proceed by resolving three issues for each of the evaluation questions on the refined list.

Type of information.--The first, and in many ways most basic, issue is the type of information each
evaluation question requires: description, comparison, or explanation (cause and effect). Each of these
areas requires different evaluation strategies.

Descriptive questions ask such things as who, what, where, when, and how, and are most often
associated with process evaluation. An example of a descriptive question is, "How many boys versus girls
attended the alternatives fair?" While descriptive questions can and should be answered with great rigor,
they do not require elaborate research designs or sophisticated statistical analyses.

Comparative questions ask about the relations among variables without assigning causality. Such
questions often concern the relationships between characteristics of the participants (age, sex, risk status)
or characteristics of staff (expertise, training, enthusiasm) and program outcomes. An example of a
comparative question is, "Is rock climbing a more effective prevention alternative for boys than for girls?"
The evaluator may choose to incorporate such questions as formal features of an outcome evaluation design,
or may choose to study them more naturalistically, capitalizing on naturally occurring variations in the
factors of interest.

Explanatory questions concern the extent to which the program produces changes in the attitudes,
knowledge, and/or behavior of the program participants and others. Questions of this type are almost
always addressed by evaluations designed to rule out alternative explanations for the changes observed. As
explained in an earlier chapter, a number of design options exist which vary in the extent to which they
rule out such explanations, thus supporting the claim that the program is responsible for observed outcomes.
Often there is a tradeoff between the extent that a given design option can rule out alternative
explanations, and the cost and difficulty of that option.

Type of measures.--For any given evaluation question and for any of the three information types
(descriptive, comparative, and explanatory), the evaluator can choose from a wide variety of measurement
techniques. These include observation, various types of interviews (structured and unstructured),
questionnaires, psychological tests and measures, and reviews of archival records.

In making initial choices from among these options, the evaluator will be guided first by the specific
question to be answered. But considerable weight must be given to the appropriateness of the measure for
the specific target population, the expertise necessary to use the measure, and the cost of the measure.
Wherever feasible, the evaluator will wish to gather data concerning a given question in more than one way.
Overall, the evaluator will attempt to maximize the quality of the data while minimizing cost and disruption
of the program's day-to-day activities.

Who will be measured.--It is almost a truism that the larger the sample obtained in the evaluation, the
more accurate the results will be. However, the law of diminishing returns (see, for example, Hays and
Winkler 1971) applies, especially when resources for evaluation are limited. In many ways, the creative use
of various sampling techniques is the evaluator's most powerful tool for maximizing the resources available.
The evaluator may also need to overcome such obstacles as school-imposed restrictions on who can be
measured, and issues of informed consent.
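The diminishing returns from larger samples can be illustrated with the standard textbook formula for the margin of error of a proportion from a simple random sample. The formula and the 95 percent multiplier of 1.96 are standard statistics rather than material from this handbook, and the sketch assumes simple random sampling and a true proportion of 0.5 (the worst case); real evaluation samples are rarely this tidy.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95 percent margin of error for a proportion p estimated from n respondents."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)


for n in (50, 100, 200, 400, 800, 1600):
    points = margin_of_error(n) * 100
    print(f"n = {n:4d}   margin of error = +/- {points:.1f} percentage points")
```

Because precision improves only with the square root of the sample size, going from 100 to 400 respondents halves the margin of error (from about 9.8 to about 4.9 percentage points), and halving it again requires 1,600. Each added respondent buys less than the one before.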



The tradeoff in this case is between the numbers of individuals who can be measured and the scope,
flexibility, and sensitivity of measurement. For example, a mailed questionnaire can reach large numbers of
individuals, but an exploration of nuances in meaning is lost. Alternatively, small numbers of individuals
may be measured in great depth and with great elaboration, but the cost of such an option may preclude
measuring a sample large enough to be representative.

Overall, the evaluator will develop various combinations of measures, samples, and evaluation
strategies. Now the manager and the evaluator face the difficult task of choosing among these various
design options.

Step 4--Initial selection of a design.--In choosing among various design options, the manager will
perhaps confront the major tradeoff in the entire evaluation planning process: striking a balance between
the usefulness of the entire evaluation and the amount of dollars, staff, and other resources that can be
committed to it. Unfortunately, resources spent on evaluation are often resources taken away from the
services being evaluated.

Happily, much of the evaluative information that is most useful is also the least expensive to gather.
Often, the refined list of evaluation questions will be somewhat weighted toward process evaluation, and the
manager may wish to choose a design option emphasizing the process level.

Of course, all prevention program managers must concern themselves with outcomes, but the kinds of
data derived from a sophisticated randomized experiment may well be unnecessary for decisionmaking. In
some cases, qualitative outcome data may be sufficient, and in many cases, a relatively unsophisticated
outcome design will be all that the manager requires.

In any event, the manager should quiz the evaluator extensively about the strengths and weaknesses of
various design options, and the strength of a given option should be measured against the importance of the
decisions to be made based on the data. Certainly the manager will not want to base major decisions on
weak data, but neither should precious resources be expended on a rigorous study relating to a relatively
trivial decision. The prioritization of evaluation questions can be used to guide the differential allocation of
resources in choosing among design options.

One final consideration in choosing among design options is the ease with which important
constituencies such as funders and legislators can understand the design. Designs vary in their intuitive
appeal and the simplicity of their logic. Instead of a tempting flashy new technique with an air of scientism
and high technology, choose the simplest design possible that will meet information needs. When the time
comes to disseminate the results, the flashy design with its complex logic and statistical
analysis may be a deterrent to clear communication. All else being equal, the easier an evaluation design is
to describe and understand, the greater an asset it will be.

Step 5--Operationalization of the design.--To this point, the manager and the evaluator will have been
dealing essentially in abstractions. However, an evaluation becomes a specific set of activities, performed
by a group of individuals, according to a detailed workplan. In operationalizing the design, pragmatic
considerations are primary. The myriad practical constraints associated with implementation of the
evaluation must now be considered. The evaluation design may have to be altered to fit the operating
context, but generally this task is one of working out the details.

Program staff are particularly important actors in this phase of evaluation planning. They are the ones
most likely to know whether this or that evaluation activity can be comfortably incorporated into the
program's operation. They may also be the best resources in terms of the ability of the program participants
to respond to various measurement devices. For example, an evaluator may plan to use a particular measure
of drug knowledge that the program person can see is above the reading level of the program participants.
Because program staff will be partly responsible for various aspects of implementing the evaluation, their
involvement in the design will help build ownership and enthusiasm.



Two of the most important tasks at this step of the evaluation are

o selection and development of evaluation instruments, and
o development of detailed timelines and workplans.

In general, the evaluator will take the lead in operationalizing the evaluation plan. However,
involvement of the program manager and staff in this phase of evaluation planning is crucial. Unless the
evaluator is very familiar with the program and the community (and most will not be), the evaluation plan
may lack sensitivity to prevailing community values and may require activities difficult or impossible in
light of the program's day-to-day operation.

Selection and development of evaluation instruments.--Almost all evaluations require some
measurement instruments. Reports of behavior, behavioral intentions, knowledge, attitudes, and
psychological variables are all regularly assessed in prevention evaluations. In some rare instances, the
selection of instruments will be a happy task of wading through several dozen choices (as is the case for
self-esteem measures for white, middle-class youth). Often, however, few if any published instruments exist
that are appropriate for the target population.

Though difficult, the process of instrument development need not present insurmountable problems. As
noted earlier, several compendiums of instruments for prevention evaluation currently exist and most
evaluators have had some experience in the development of instruments. The use of newly developed or
revised instruments will, of course, require additional time for pretesting and revision (see step 6 below).
Suffice it to say, this time will be well repaid in the quality of the evaluation data.

Ultimately, the manager, program staff, and even program participants are in the best position to judge
the appropriateness of a given instrument for their community. If the instruments suggested by the
evaluator seem inappropriate, the manager must consider revising them or developing entirely new
measurement techniques. Failing to do so risks the quality of the entire evaluation effort; doing so
increases costs.



Development of detailed timelines and workplans.--Often the role of managing the evaluation will fall
to the program manager or a staff member. Logically then, the manager or designee should take primary
responsibility for mapping out an evaluation workplan. Ideally, the evaluation will be managed using the
same techniques as other agency business. If formal techniques are employed for program management,
such as Management by Objectives or Gantt charts, these should also be employed to develop the evaluation
workplan. In general, however, the key issue is to determine in advance the various evaluation tasks, the
necessary person power, the work assignments, and some method for ensuring the timely completion of the
evaluation. In developing a workplan for the evaluation, be sure to allow enough time for each evaluation
task. To paraphrase an old saying:

     The first three-quarters of the evaluation will take three-quarters of the time;
     the remaining quarter will take the other three-quarters.

The manager unfamiliar with evaluation activities may tend to underestimate the time that tasks require.
An evaluator can provide useful guidance here, but a conservative timeline that allots too much time for
various evaluation tasks will never be regretted.


A second major issue in developing the evaluation workplan is to ensure that major activities, such as
testing of participants, occur at times that are convenient, feasible, and consistent with the design. All
too often evaluation plans schedule pretests during summer vacation, posttests during the manager's
vacation, and data analysis while the computer is tied up with other business. Here, as elsewhere, the active
participation of program staff in development of the evaluation workplan can avoid problems and greatly
facilitate implementation of the evaluation.
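
As an illustration only, the two scheduling cautions above (pad every time estimate, and check planned activities against periods when people or equipment are unavailable) can be sketched in a few lines of Python. The task names, day counts, the 1.5 padding factor, and the calendar dates are all hypothetical; none is drawn from an actual workplan.

```python
import math
from datetime import date, timedelta


def build_schedule(tasks, start, padding=1.5):
    """Lay tasks end to end, inflating each estimate by `padding`.

    tasks: list of (name, estimated_days) pairs.
    Returns a list of (name, first_day, last_day) tuples.
    """
    schedule = []
    current = start
    for name, estimated_days in tasks:
        padded_days = math.ceil(estimated_days * padding)
        last_day = current + timedelta(days=padded_days - 1)
        schedule.append((name, current, last_day))
        current = last_day + timedelta(days=1)
    return schedule


def find_conflicts(schedule, blackouts):
    """Return (task, blackout_label) pairs whose date ranges overlap.

    blackouts: list of (label, first_day, last_day) tuples.
    """
    conflicts = []
    for name, first_day, last_day in schedule:
        for label, blocked_from, blocked_to in blackouts:
            if first_day <= blocked_to and blocked_from <= last_day:
                conflicts.append((name, label))
    return conflicts


# Hypothetical workplan: estimates in calendar days.
tasks = [
    ("pretest", 10),
    ("program delivery", 60),
    ("posttest", 10),
    ("data analysis", 20),
]
schedule = build_schedule(tasks, start=date(2025, 3, 1))

# Hypothetical period when participants cannot be reached.
blackouts = [("summer vacation", date(2025, 6, 15), date(2025, 8, 31))]

for name, first_day, last_day in schedule:
    print(f"{name:<18}{first_day} to {last_day}")
for task, label in find_conflicts(schedule, blackouts):
    print(f"Conflict: {task} overlaps {label}")
```

A listing like this is no substitute for the staff's knowledge of the program calendar, but it makes overlaps visible early enough to move a pretest or posttest.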




The implementation stage of the evaluation comprises the following steps in the evaluation process:

o step 6--field test of the evaluation plan
o step 7--revisions resulting from the field test
o step 8--collection and analysis of data

Step 6--Field test of the evaluation plan.--Once the goals of the evaluation have been established
and measurement instruments selected, it is tempting to move to immediate implementation of the plan.
However, in most evaluations it is essential to field test the evaluation plan.

A field test is a practice evaluation. A small sample of service recipients will be involved in trying out
the questionnaires. Data will be analyzed, and presentation formats examined. The purpose
is to determine whether the plan works. The Handbook for Prevention Evaluation (French and Kaufman
1981, p. 19) says this about field testing:

     All aspects of the evaluation . . . including sampling, measures, data
     collection plans and analytic procedures, and utilization activities. The pilot test determine
     (sic) whether the data collection schedule is feasible, if the collection can be carried out
     with minimal . . . , if the data being collected are valid, whether
     the variables are reliably measured, if the costs of data collection and analysis are on
     target, and whether the resulting information is used as intended by the decisionmaker.

This comprehensive order can be broken into three basic components: testing the design, testing the
process, and testing usability of the data. The design may call for providing certain services to some people
and something different to others. Certain types of data will be collected. The field test shows if the
design works. Can the procedures be applied as planned? Will respondents be available and cooperative? Are
the data analyzable if collected in that manner?

Pretesting the planned process may prove that questionnaires are too lengthy or ambiguous,
psychological measures invalid, or anticipated file data too sketchy. More extensive training of interviewers
may be required. Pockets of resistance among the staff may surface, and everything may take longer than
anticipated.

Finally, a field test should help clarify whether evaluation data will be useful. Will the types of results
answer the questions the manager wants answered? If not, the evaluation will not serve its full purpose.



The manager may reasonably expect that the evaluator will be expert in determining how extensive a
field test is needed and designing an appropriate one. The role of the manager in the field test includes:

o assessing the value of field testing
o participating in planning a useful test
o conveying to the staff and relevant others the need for a field test
o ensuring resources and cooperation necessary to complete the test
o helping review test results with an eye toward those aspects of the evaluation over which the
  program manager has control
o working to effect any changes needed in the evaluation design.



The manager's most difficult role may be enlisting the cooperation of the staff, who may consider the
evaluation itself sufficient nuisance without needing practice first. The manager's attitude and appropriate
involvement of staff in previous phases of the evaluation will be the best levers in obtaining staff
cooperation.



Step 7--Revisions resulting from the field test.--The purpose of the field test is to perfect the evaluation
plan, eliminating such bugs as may be found. For example, service recipients in one program were asked by
staff to submit voluntarily to interviews. As a result, the volunteer rate was quite low. Staff resistance
proved to be the problem, and efforts were increased to bring staff into the evaluation process. Another
evaluation required correlation of pretreatment demographic variables with posttreatment behavior. Field
testing revealed deficiencies in pretreatment data gathering, which were corrected.

In a third case, field test results included an unexpected negative correlation between treatment 
conditions and posttreatment attitudes of Hispanic clients. The problem was found to lie in the translation 
and interpretation of the Spanish-language questionnaire. 

These examples indicate the types of problems which can be spotted through field testing and that
require the active involvement of the program manager. Each example involved a condition the manager
would like to avoid, such as antagonizing clients; a problem that could reasonably be handled, such as poor
records and staff resistance; or a problem that threatened the value or usability of results.

Other problems of evaluation design, technical aspects of data analysis, or problems in instrumentation
are legitimately within the domain of the evaluator.

Step 8--Collection and analysis of data.--This stage has three substages: implementation, analysis, and
interpretation.

Implementation.--At this point the evaluation is in progress. The bugs have been worked out, and the
procedure smoothed. The manager's role now is to monitor the process, to ensure that the evaluation is
being conducted as planned and that program services continue to be delivered without significant
alteration or disruption. Clearly, not only those evaluation activities under direct program control, such as
interviewing clients or differential client treatment, but all evaluation activities should be monitored.

Analysis.--This is a fairly mechanical stage in which the gathered data are analyzed. The analysis may
be as elementary as frequency counts or as sophisticated as multivariate statistics. The responsibility for
conducting the analysis and the format in
which results are ultimately presented should have been decided upon much earlier in the process, tried out
during the field test, and should have the manager's concurrence.
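
For the elementary end of that range, a frequency count takes only a few lines of Python. The sketch below is illustrative: the survey question, the response categories, and the answers are invented, and a real analysis would follow whatever plan the manager and evaluator settled on earlier.

```python
from collections import Counter


def frequency_table(responses):
    """Return {category: (count, percent of all responses)}, most common first."""
    total = len(responses)
    if total == 0:
        return {}
    counts = Counter(responses)
    return {
        category: (n, round(100 * n / total, 1))
        for category, n in counts.most_common()
    }


# Hypothetical answers to "How often did you drink alcohol in the past month?"
answers = ["never", "once", "never", "weekly", "never", "once"]

for category, (n, pct) in frequency_table(answers).items():
    print(f"{category:<8}{n:>3}{pct:>7.1f}%")
```

Counts and percentages of this kind are often all a process-level question requires.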

Interpretation.--Each of the nine steps being discussed is dependent on the success of the preceding
steps. However, this substage has a high degree of independence. Even the most clearly phrased question
may yield answers that contain not a clue as to explanation. The
presentation or wording of results can affect how results are interpreted.

In one instance, a school-based decision-skills program for preadolescents was found to have no
measured impact on later drug use. This failure may have been due to improper program implementation by
the staff, poorly trained or inexperienced personnel, or application of the program to the wrong population.
Or perhaps it was just a bad idea. Which of these possibilities should be discussed and/or emphasized in the
report? How should the results be interpreted? Who gets to make the decisions? These questions will be of
definite consequence to the manager.



Further, suppose the program was shown to have led to a 15.5 percent reduction in later drug use.
Consider the different interpretations that would attend the following statements:

     The program yielded only a 15.5 percent reduction.
     The program yielded a 15.5 percent reduction.
     The program yielded a reduction of over 15 percent.

Or, perhaps the program was shown to lessen drug use, but program recipients rated the program
negatively. Consider the difference in emphasis between these statements:

     Although program recipients tended not to rate the program favorably, they
     did show a significantly lower rate of subsequent drug use.

     Although a significant reduction in subsequent drug use was demonstrated,
     program recipients rated the program negatively.

The consequences of interpretation will generally be felt in one of two ways: decisions internal and
decisions external to the program. In the first case, decisions to change or not change programs will be
based on interpretations of results with emphasis given to some results more than others. Interpretation and
emphasis may stem entirely from the evaluator, be left to the manager, or be jointly derived. The manager's
goal is to make or receive as accurate as possible an interpretation to make the best possible decisions.

It may be that the locus of decision lies outside the program, perhaps with the funding agency. Funding
sources, of course, deserve accurate interpretations. Program managers will be legitimately concerned not
only with accuracy but with the political and economic context within which decisions will be made. When
the context places the program in a vulnerable status, managers will prefer some statements to others.
"Only 15.5 percent" and "15.5 percent" are equally accurate information but differ in connotation and may
lead to different decisions. The argument here is not for skillful deception but for decisionmaker
involvement in the form of data presentation and in the interpretation of results.
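
The figure behind statements like these is usually a relative reduction: the drop in the rate of later drug use among program recipients, expressed as a share of the rate in a comparison group. The Python sketch below assumes that definition, which the text does not spell out, and uses invented counts chosen so that the result comes to 15.5 percent.

```python
def relative_reduction(comparison_rate, program_rate):
    """Share by which the program rate falls below the comparison rate."""
    if comparison_rate <= 0:
        raise ValueError("comparison_rate must be positive")
    return (comparison_rate - program_rate) / comparison_rate


# Hypothetical followup counts: later drug users out of youths surveyed.
comparison_users, comparison_n = 200, 1000
program_users, program_n = 169, 1000

reduction = relative_reduction(
    comparison_users / comparison_n,
    program_users / program_n,
)
percent = round(100 * reduction, 1)

# The same number supports differently weighted sentences.
print(f"The program yielded only a {percent} percent reduction.")
print(f"The program yielded a {percent} percent reduction.")
```

The arithmetic is identical in both sentences; only the wording changes, which is why the form of presentation deserves the decisionmaker's attention.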

Step 9--Utilization of results.--Sometimes evaluations have to be done pro forma; the fact that they are
done is sufficient, with no requirement, expectation, or hope of their use. Ideally, however, evaluations will
be used, and from the outset conducted with ultimate use in mind. Chapter 10 of the Handbook for
Prevention Evaluation contains a discussion of factors important to the uses of evaluation. The core of its
message is to

     build utilization into your design from the beginning.

Davis and Salasin (1975) cite a collection of articles on critical evaluations of Federal programs. In
each case, the evaluation was forced on the recipient agency by a superordinate agency and was designed to
meet the latter's needs. And in each case, the managers of the evaluated programs spent their energies
criticizing instead of using the evaluation. "Utilization," Davis and Salasin (p. 623) note, "may be more
apparent than real when mandated by authority . . . without collaborative involvement of the people
representing the program."



Patton (1978) makes the point that "People, not organizations, use evaluation information," and
reemphasizes that the intended users of an evaluation should help plan it. Patton's survey of Federal
decisionmakers indicated that two characteristics influenced the use of evaluations: political and personal.

Political considerations are essentially external to the program, involving social issues, budget cuts or
growth, or large-scale social program success or failure. These issues are discussed further in chapter 7.
For now, it is useful to recall that a program is often the result of a political process and its evaluation may
be part of the same or a new political movement (Weiss 1975). Although evaluation is a scientific process in
search of truth, it does not always avoid fighting and is often also a method of fighting within the political
arena (Lindblom 1968).

Thus, a community concerned about drug use may value the existence of a program more than a
scientific demonstration of its success. Elected officials who helped initiate the program thus might pore
through an evaluation looking for words of praise and ignore pages of criticism. Or, in times of decreasing
public budgets and general disenchantment with human service programs, an evaluation finding only
moderate success may be read as a condemnation of the program for not being perfect. However, an
unevaluated program may be able to prove nothing about itself except its existence, and thus is vulnerable
to any attack waged against it.

Whatever the political climate, a program manager has to work within it and may have little or no
impact on it. Thus, the second of Patton's two critical factors, personal, will usually be a more appropriate
focus for the manager. By personal, Patton (1978, p. 64) means "the presence of an identifiable individual or
group of people who personally cared about the evaluation and the information it generated." When this
factor is present, the evaluation is more likely to be used. Consider this statement, made by an evaluator
surveyed by Patton (1978, p. 66):

     Where there were aggressive program people, they used evaluations whether
     they understood them or not--used it as leverage to change . . . his program.

Another (p. 67) said an evaluation was used "because the decisionmaker was the guy who requested the
evaluation and used the results. It was the fact that the guy who was asking the questions was the guy who
was going to make use of the answers." Use of the evaluation will emphatically depend on this personal
factor, most often that of the manager, whose involvement from day one in all steps will set the stage for
ultimate use. As Weiss (1975, p. 19) said, an evaluation "is most likely to affect decisions when it accepts
the values, assumptions, and objectives of the decisionmaker."

While the primacy of political and personal interest is acknowledged, other factors do contribute to the
usability of evaluation. Glaser and Taylor (1969) compared unsuccessful with successful evaluations and
found the following contributed to success:

o from the beginning, high involvement of relevant groups inside/outside the organization
o study designed by a full-time principal investigator
o commitment of the host agency
o evaluation aimed at a felt need of the organization
o involvement of potential consumers of results
o readily disseminated findings.

Patton (1978) reviewed the literature and listed other factors contributing to evaluation use:

o methodological quality
o methodological appropriateness
o timeliness of evaluation
o timeliness of the final report
o whether findings were positive or negative
o "surprisingness" of findings--were results expected?
o whether central or peripheral program goals were evaluated
o existence of related findings elsewhere
o resources available to implement changes
o evaluator-manager interactions.

Weiss and Weiss (1981) surveyed social scientists and decisionmakers to determine their views on what
impeded and promoted effective utilization. They found appreciable agreement between evaluators and
decisionmakers. Some major impediments over which managers have a high degree of control were
tendencies for:

o decisionmakers to ignore information contrary to their own ideas
o policies to be arrived at by politics, not research
o agencies to ignore findings contrary to their policies
o decisionmakers to have difficulty defining research needs
o lack of communication between decisionmakers and evaluators.



There were also factors that both groups agreed contributed to evaluation usefulness:

o topic of study is of particular interest or relevance
o study looks at variables that decisionmakers can do something about
o report is understandable, not overly technical.

Decisionmakers placed more emphasis than did evaluators on timeliness of the reports and on the
inclusion of the user in the population studied. Evaluators were more likely to be concerned with studies of
social concern and with dissemination of information. The number of factors is partly arbitrary and
semantic. What is important is the relative value of each in a given situation. Note that none of these
factors arises at the end of the evaluation. Each may be anticipated from the outset, and failure to
anticipate them virtually guarantees failure of the evaluation.

However, the converse is not necessarily true. Anticipating the future does not guarantee that the
future will arrive as anticipated. Davis and Salasin (1975) advise on tactics for effectively presenting
evaluation results, recommendations, and changes which may result from them. They cite several important
considerations:

o The presenter is able to identify with the audience.
o Essential information is repeated and restated often.
o A combination of logical and emotional appeals is made, without exaggerating the latter.
o The benefits and risks of change are made clear.
o Recommendations are consistent with the values of recipients of the presentation.
o Objections are anticipated and dealt with.
o Free expression of resistance is encouraged.

Management of change is a topic outside the scope of this volume. However, the principles of involving
key personnel from the outset and of intelligent preparation of results and recommendations will lay an
effective groundwork for making needed changes.

A final issue concerning use of evaluations is how to deal with negative results. There are many
potential reasons for negative results: improper concept, improper implementation, improper evaluation, or
external factors beyond the program's control. Some evaluation designs may help identify the causes of
failure, others may not. Occasionally, failure is built into the program. For example, to secure funding,
planners may promise more than may be deliverable or promise to deliver results more rapidly than is
possible. In such cases, the evaluation will find that goals have not been completely met. Independent of
such contrived dilemmas, however, newer programs often fail to meet even rational expectations. The
recommended rule of thumb for such cases is this:

     Programs must be allowed to fail.



The appropriate response to negative results from evaluations of new programs is often not radical
program change, wholesale firings, or funding cuts. Rather, unfrenzied program introspection, heightened
attention to implementation procedures, and renewed coordination with the community may enable
programs to overcome failure. Programs not allowed to fail are not allowed to grow, change, or adapt; to
take risks and be creative; or to meet intended needs.

In sum, utilization is the raison d'être of evaluations. Planning for utilization should be an integral part
of planning all components of the evaluation, from the initial stages of identifying questions to the end stage
of presenting the answers.

CHAPTER £: CASE STUDIES IN PREVENTION EVALUATION

(What Really Goes On . . . Inside a Triple Feature)



AN OVERVIEW

The three hypothetical case studies in this chapter are intended to emphasize the realities of the
evaluation process as experienced by prevention program managers, staff, and evaluators.

The case studies present programs at different stages of development and reflect various
prevention modalities. Each illustrates aspects of the evaluation process described in
previous chapters and has its own unique motives and primary audiences for the results of the evaluation. In
these case studies the interactions between the program managers and the evaluators are the most
significant aspect of the narratives.

Although these case studies represent a slice of evaluation life, the reader should understand that a
much wider range of strategies and issues occurs in an actual evaluation.
However, the material presented does capture the essence of the evaluative experience. The stories are
entitled . . . Discussions, and One Suspenseful Melodrama. The dialog at times
is lighthearted, but each case study is essential to the theme of this volume: good
evaluations occur when program managers and evaluators work cooperatively on an evaluation.



DOUBLE TROUBLE

Alternative Designs for Alternatives Programs

The Brightside Youth Center, located in a major midwestern city, was established 7 years ago to
provide prevention and intervention service to troubled youth. It is housed in a community center and
currently delivers services in two broad areas: drug and alcohol prevention services in the public schools,
and a program of social and recreational activities for youths from 8 to 18 years of age. The Brightside
staff consists of 12 people, most of whom are counselors or social workers. Their funding comes from a
mixture of State and local drug and alcohol prevention grants and United Way support, supplemented by
small amounts of private donations.

Donna Campbell is the director of the Brightside Youth Center, a position she has held for the past 3
years. Two other staff members, Joanne and Jim Cook, are assistant directors in charge of the
drug and alcohol prevention component and the social-recreational activities component, respectively.

During the past several months, Donna, Joanne, and Jim have discussed their needs for evaluation of
the Brightside programs. Although none has a background in evaluation (in fact, they have always been
pretty resistant to the whole notion), they recognize that their funding agencies are increasingly asking for
evaluation information of a fairly sophisticated nature. Moreover, Donna and her staff have recently begun
to believe that perhaps some evaluation might help to identify more effectively the strengths and
weaknesses of the Brightside programs. So a few weeks ago Donna called the National Prevention
Evaluation Resource Network (NPERN) to ask for some technical assistance to help them develop an
evaluation strategy. NPERN responded to her request by arranging for a consultant skilled in program
evaluation to work directly with the program. The consultant, Ron Fisher, is a research psychologist
who specializes in the evaluation of drug and alcohol prevention programs.
Ron and Donna talked briefly on the telephone about the purposes and functions of the consultation visit.

During Ron and Donna's initial meeting in her office, they discussed basic matters relating to the
Center's organization and history (objectives, staffing patterns, and the like). She also shared her motives
for the evaluation with Ron, at which point he expressed pleasant surprise:

"You mean you're not under heavy outside pressure? That's as rare as someone going to an alcohol
counselor on their own initiative."

Joanne joined the meeting as they began analyzing the functions and activities of the drug and alcohol
prevention program. Joanne described the program's major activity as the provision of broad prevention
services to two large high schools and three junior high schools on the south side of the city. (The south side
population is 24 percent Hispanic, 28 percent black, and 48 percent white, mostly second and third
generation Polish and Italian.) The Brightside staff conducts semester-long classes in these schools called
Positive Directions for Youth, which include sessions on interpersonal communication, stress management,
self-concept, family dynamics, and drug and alcohol use. Teacher-facilitators assist the Brightside staff in
the conduct of the classes. Approximately 20 percent of the student population is assigned to the classes;
plans call for a gradual expansion of coverage to include the entire student body eventually.

As we look in on the meeting, Ron is about to discuss potential evaluation designs with Donna and
Joanne.

"I think now I've got a pretty good idea of how your drug and alcohol prevention program runs, its
goals, general strategies, and so forth. So I think we're ready to start talking about some possible evaluation
designs you might want to implement. How's that sound?" Donna and Joanne look at each other, then at
Ron, nodding affirmatively.

"Before we go on," Ron continues, "I hope you had the chance to read NPERN's Working With
Evaluators. Not only can it save time in defining terms and the evaluation process, but one of the case
studies in that monograph bears a striking resemblance to your program and, in fact, to our discussion so
far." Everybody nods vigorously.

"OK, very good," Ron goes on. "Now, as you might know, there are two basic kinds of evaluation--
process and outcome. With process evaluation our first interest is an accurate documentation of what kinds
of services and activities your program actually engages in--the exercises you use in the class sessions, what
the kids actually do, etc., and second, who receives the program services--the types of kids who are in the
program. With good documentation you can go on to more sophisticated process analysis. On the other
hand, outcome evaluation is used to--"

"Hold it please, Ron," Donna says, smiling, but with an upraised hand as though stopping traffic. "This
is all pretty new to us, so let's take it one step at a time. How is 'process evaluation' useful to us?"

"I'm sorry," Ron grins sheepishly. "Please feel free to stop me and ask questions whenever you're not
sure of something. Well, process evaluation can help you in a couple of ways. It can be a management tool
to help you keep track of what is actually happening in your program and what your client population looks
like at any point in time. This kind of information can also be used for annual reports, reports to funders, in
grant applications, and so forth, to show external funders and agencies what you are doing--and that you
have solid information about what you're doing. It's pretty basic stuff we're talking about here, the kind of
documentation that, to some degree, every program should have. And, of course, that lays the groundwork
for cost-efficiency and other more complex analyses."

"I see," Donna nods. "And outcome evaluation?"

'''i*; . - .. : ; . • , 

A "Outcome evaluation is designed basically to assess the extent to which your program is achieving its 
n&jor goals. In your case, Joanne, outcome evaluation would attempt to determine how well your program - 
actually prevents the use and abuse of drugs and alcohol among the kids in the program." 

"But we address more basic issues of adolescent adjustment in our program, not just drug and alcohol
use." Joanne asks, "Shouldn't we assess program effects on such dynamics as self-esteem, communications
skills, and so forth?"

- "Most definitely," Ron Replies. /pirtcome evaluation /should address those objestiyjgs that are usually 
considered intern^ alcohol abuse, including attitudes toward drug 

^D^L^j??' 1 ?] .However, it's i importalit to keep in mind that for a drug and alcohol abuse prevention 
program, the (focus of outcome evaluation should remain on drug and alcohol use;" 

"I understand that," says Joanne, "but I also know it's difficult for. a prevention program^ to show 
evidence of effect on drug and alcohol use in a rather brief time period. 1 don't want to pin the entire 
assessment of bur program's'effeetiveness on behavior that even we feel won't show effects for some time." 



"I agree completely, so we'll probably build several levels of measures into our outcome evaluation. 
But we 1 re getting a little ahead of ourselves, bet's first talk about the gener«idesign; and then .we can get 
into the specific aspects of the outcome criteria. Shall we talk about the process evaluation first?" 



"No, J'd prefer to talk about the outcome evaluation design possibilities first," Donna suggests, "if 
that's OK, Ron--that's the one t|iat scares me!"" J 

"That's fine., Now, as L understand it, the students who attend the Positive Directions for Youth (PDY) 
classes are a cross section of kids selected from a larger pool. So you are taking only a fraction of those 

• ■ > 



studenjts who arQ^eli^le,' right?" 



"Yes* that 1 ^ right," Joanne agrees. 

"Can we identify a pool of eligible kids approximately twice the sizg of the pool that you will select: 
for the classes?" Ron asks. . ' •.' 

"You mean at each school?"

"Yes."

"I don't see why not," says Joanne.

"In that case, we might have an opportunity for a true experiment— which is a very powerful outcome 
evaluation design,* 1 Ron points out. ' , » £ 

"Sounds pretty ambitious ... an 'experiment,' "-Donria interjects. "How-does that work!" 

h "Well, let's say that at a given school we identify maybe 100 Rids Who are eligible for the program. We 
then randomly assign them to either the PDY classes or to a control group— whatever class or condition they 
would otherwise be assigned to." - . I 
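
The random assignment Ron describes can be sketched in a few lines of Python. This is an illustration only, not part of the case study: the function name assign_groups, the student ID numbers, and the seed value are all assumptions made for the example.

```python
import random

def assign_groups(eligible_ids, seed=None):
    """Randomly split eligible students into PDY and control groups of (nearly) equal size."""
    rng = random.Random(seed)          # a fixed seed makes the assignment reproducible
    shuffled = list(eligible_ids)
    rng.shuffle(shuffled)              # every ordering of the roster is equally likely
    half = len(shuffled) // 2
    return {"PDY": sorted(shuffled[:half]), "control": sorted(shuffled[half:])}

# Hypothetical roster: 100 eligible students at one school, identified only by number.
groups = assign_groups(range(1, 101), seed=1982)
print(len(groups["PDY"]), len(groups["control"]))
```

Because chance alone decides who lands in which group, neither staff judgment nor student motivation can tilt one group relative to the other.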

'■'I "What's the advantage of randolm ps&ignment?" Donna lboks a bit skeptical. . _ . j 



, ^^Well, it's just the .best way to insure that we come as close as possible -to haying equivalent groups to 
5OTip£ire, that the kids in the control group will be as much like those in the PDYiclasses as possible, in 
t'ermfi » pf background, motivation, and so forth." f \ 



"And. . . " Donna prompts 



: "And so when we compare them on outcome measures-— their attitudes toward drug use, communica- 
ions skills, etc^— whatever differences we find can beattrjby^ the 
eason for the differences is that the PDY group was smartgr f >or Better motivated, or^jghatever;" 



"Do outcome evaluations always use^ random assignment?" jJqanni ^SJcs. 

""No, not at alljCJRon explains. "In some instances, program staff foay provide sepvrces t<5* virtually all 
5 ^8lW^-j" c l^ntis;-^i^vT|K- no_ clients_ to_»assign _to_a control group. Or the program staff may have strbng 
eeiings" about 'denying' services to anyone— although that kind of stance occurs less often wijth prevention 
urograms than with intervention or treatmeftt-orograms, since prevention services typically are not aimed at 
•articular individuals who are clearly in heed of sbrn^ litfrnediate assistance." 

" x Lsee," says Joanne,' "But what would we do if we could riot randomly assign students to PDY or a 
yitrdl v group?" \ ; ^ - 3? ; j f 
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"Then we would probably try to identify a grpup^-a class in this iristance—that is as simllar t as possible 
to the PDY group arid use it as a'cdmpahisbtl group." " — 

"And collect outcome information on them at the same time as_the PDY group?" Joanne asks; 

' \ •'/;::.,_, _+ ' '- ^ - ■ ' 

"Yes, that's right," Ron replies. "Another option would be to coliecjt the outcome ^information on both--* 
groups at several points before, during; and after the PDY services are delivered. That's called a 'time 
series design,' by the way." v 

"But these strategies aren't as good €ts thecarfciom assignment abroach?" asks Donna. 



"No, they aren't, but they're' definitely better than no evaluation at all!" 

- • • * 

v < "''What kind of outcome measures should we use?" Donna queries. ^ 

"Well, the particular outcome measures' we use will depend on several considerations, including the 
objectives of your program, the characteristics of your clients, arid how much time arid resources you have 
to devote to outcome data collection."/ — 

"All that, huh?" Joanne smiles, looking over at Donna. $1 - " ; s 

"I'm afraid so!" Ron answers. "Aside from the selection of the design, there is no more critical step in
the development of your evaluation than choosing your outcome measures. Remember, they're the
yardsticks by which your program's impact will be measured. You want to make sure that they really reflect
what you think your program will achieve. And of course we want to be sure that they are valid and
reliable--accurate measures of outcome."

"Shall we start by looking at our program's objectives?" asks Joanne.

"jfeL Fortunately, you folks haive don^a fine job bfr developing realistic, measurable objectives.". Boh 
pulls out the hst of PDY objectives from the materials Donn^ had sent to him, developed as a result of her 
prior conversations with NPERN. "it seems to me that ,they reflect six general types of outcomes: 
substance use, including alcohol, drugs, and. tobacco; attitudes toWard substance use; self-concept; stress 
management; interpersonal skills; and family S^ynamic^. Is that accurate?]' t 

"Pretty, much so," nods Jokmie. "But^the *nterpersorfal area" should also include things like 
communication skills and reactions to p^er pressure." - ; 

"I see. Well, some fairly good instruments are available for the measurement of these outcomes,
although measuring stress management poses some problems. These instruments are designed for use
with client populations of the same age and grade level that PDY serves. However, we're sure to encounter
some reading problems, don't you think?"

"Yes, we will," Donna answers. "Perhaps 15 percent of the students at the junior high schools will have
very low reading skills. Somewhat fewer at the high schools. How do we handle that?"

"Usually we administer the instruments verbally. It would help a lot if these students were previously 
identified. Can we do that?" 

"Probably," says Joanne. "Let me check on that with school staff."

"What about other outcomes like grades, disciplinary records, and so forth?" asks Donna. "We already
tried to go through school records for our kids, but the way they keep their files, it's practically impossible
to hunt down data for individual students in our PDY program."

"That's a shame," Ron says. "The more important question is whether there's reason to believe that 
the program wil} influence those indices, but that becqjnes academic since you can't get the data anyway." 

^iat 



.*- : "OK, rffew what about consent from the parents for the data well be collecting?" Donna* continues. 

«<' "Well/bbth the parents arid studerits will sign a form 4hat describes thp reasons for JheWata collection 
and thfe type^ff topics covered in the instruments— what^^^gii ^nfor^d consent.' ^pd discourse you'll 
nejgj to get agreement from the school authorities to conduct t?iestu<Jy." * ^ •- 
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, ^§o, we^re basically talking about a set of paper-arid^encil iristruntOTts--attitude scales, checklists, 
^ that sort of thing— as our measures for the outcome •fev^ua^i^^? ,, asks Dbriria. 



"That's llight." 




"Well* I have a couple of concerns about that approach." Donna looks troubled. _"Fjrst, ^ow can we be 
sure that thbs6 instruments will really measure the kind of impact we think bur program has on the kids?" 

"There are no guarantees," Ron admits. "Thegjt^st way to hel^ 
program impact is to use instruments that .have a good track record— tijat is,^^chometric k data on their 
reliability and validity-- arid for us to examine; carefully the items on the irisd^merits tb_satisfy_ ourselves 
that they tap the kinds of attitudes and behavior^that the PDY pr^p is designed to affect. One of the 
things I can do for you is explain why some items thai don't appear to directly address the issues might be 
useful. L Those iterrj^ in our jargon, ddn't have 'face validity.' Some of us caU this the f interocular test 1 — if 
the reason for its being there doesn't hit you right between the eyes, it doesn't have face validity. But there 
are lots of good measures that don't." 

.....77. "! see." Donna nods. "My other concern is that we might be relying too heavily on paper-and-penciT 
types of measures. Shouldn't we dp some observing or interviewing— or something other than just the 
instruments?"' ■ v V •„ 



"Yes, we could," Ron agrees. n ln fact, it is best to use more than ^nemf^hod to measure ^anything. 
Observations, for instance, may be the best way of looking at the whole, dynamic of your program without 
lL m jtr n S ^ tests require. But that depends bri your resources; 

interviews 'and observations are vepygonsuming of stpff time, as you've already^ found with the school 
»cbrds." '\ ; ' ; ^ * — 

"Well, let's at least consider those possibilities after we see what kind of resources the whole
evaluation process will require, OK?" asks Donna.

"Of course."

"OK, Ron, what are we going to do with all these 'data' after they're collected?" Joanne wants to
know.

"Well, with the kind of data, we'll b4 collecting. and the design we're using, the only real limitations on 
the analysis will be the amount of resources' you can 'devbte to it— particularly tfto availability bf compute!* 
facilities. And J should be able- to assist you. at that point." ' : 

"We've used the computer facilities at the local university in the past, but only for some very routine
tabulation activities," puts in Donna. "Maybe we could arrange something there."

"Check into that in some detail, Donna. All these data won't be much good if we can't analyze them."

"Could you give us an example of what kind of statistical analysis might be used?" she asks.

+■ "We'll probably usfe Analysis of Co variance on most of the outcome data." 

■. /■ '. __. ._ ■ J. :c ■ v : Q * 

"Explain that, will you Ron— in simple terms, OK?" - 

"Sure. Basically; this analysis will C( ^|l£^^ sabres of PDY kids bri the outcome measureiat the end 

of the PDY sessions with tho^fe scores • of JQ|j^f^ who ^o not participatejiri the PDY program— stlhtistically 
adjusting the scores for any differences that^exist between the groups on the pretests." 

"3o, we're essentially comparing the draount of change in the two groups, rather than "the Bb^blute- 
level of their scores, right?" asks Donna; ^ « 



' "Yes, basically that's cornet." , l *'-" m v 

v. 4 - "Wiit we be able to measure the combined Effects bf the prbgrSm across all the outcome measures- 
sort of the overall effects?" Joanne queries. . - ■-■ , .,: 

_ "Yes, we can, ^ut that . wi»*requirfe the use bf Multivariate Analysis of t Covariahce. There are 

'tradeoffs here; On. one* side, it will cost more in computer time and h&jUire substantially tribti analytic 
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effort by a well-qualified statistician, and s interpretation by us. C^ -thev other side^ the Additional 
information .that could be developed may tell you more about the iriterpffiy of the different components of 
the program!" , , ' _ - ; «. 

(Donna, Joanne, and Ron then discuss how the outcome evaluation will be carried out, including
specific roles and responsibilities. Ron emphasizes the need for a pilot test of the instrument package on a
small but representative sample of students. They discuss in great detail the resources required to prepare
for, collect, analyze, and interpret the outcome data. Donna is especially concerned about this, since she
was "burned" in her previous experience with an evaluator who drew up an elaborate design and dropped it in
their laps. Only later did she realize that they did not have anything near the resources needed to carry out
this grand evaluation.

Their final decision is not to include Multivariate Analysis of Covariance at this time, given resource
constraints. They then move on to a discussion of the process evaluation. As we rejoin the group, they are
summing up the plans for the process evaluation.)

"OK," Donna says. "Let me make sure we understand, what this 'process' evaluation is about— and why 
we're doing it!" she laughs. ;■ ■ • 

^ "Fair enpugh. Go to it!" j : / 

• » "Well have Observers in the PDY classes recordinf ^?h^*$ession events ,oh a form that you'll help us 
develop. These observations will produce narrative descriptions of session events, ^his narrative could serve 
as a foundation for the future development of a forma}, quantitative rating scale of both student and 

* teacher behaviors during the sessions. Ami right so far?!' , v • • *. L 

"Right. And the number of times you do the observations--the schedule for sampling the sessions--will
depend upon whether your own staff does the observations or whether you can enlist some volunteers. Also
remember our discussion about the importance of the observers gaining the trust of the students and the
facilitators, and remaining detached from the conduct of the sessions."

"Right--yes, we can't forget that," agrees Donna. "And this information will help tell us whether
our services--the PDY sessions--are actually being presented in the way we intend--correct?"

"Right again."

(After a break, the group reconvenes to discuss a second evaluation design for their alternatives
program. At this point Joanne Martinez leaves and Jim Cook, director of the alternatives program, joins
Donna and Ron. Jim begins the discussion with a description of the program, called Brightside Alternatives
for Youth (BAY). BAY is housed in the Brightside Youth Center and utilizes its extensive recreational
facilities, which include a basketball court, a room containing a boxing ring and weight-training equipment,
and a game room with ping-pong and pool tables. The organized sports activities include baseball,
basketball, boxing, volleyball, and weightlifting. The social activities consist mainly of teen dances held
every Saturday night at the center. Jim has three staff members who double as counselors and coaches.
Counseling is done on an informal basis: as staff identify needs or problems in a youth visiting the center,
the youth is asked to step into the counselor's office to "talk for a while." Ron is now asking Jim about the
youths who are in the BAY program.)

4 "So the kids who .are in the^gggggp are of all ages, and mostly Hispanic?" - ..- 

"Yes. Their ages range from 6 to 19 or*20. Most of them are^Hispariic; the rest are a^nix of blacks 
and whites, from mostly wooing class families." : - . ■ 

"Hq>w marly kids are in the BAY program?" ; . . k 

; "That's hard to say," Jim replies. "It depends on whether you'count.the after-sch&ol dromns, the kids 
who come to the dances, or just the kids on the teams. I could tell you who's on the teams, but we don't 
keep track of the dropinsror the kids, who come to the dances:" - a 

"Are any of the kids referrals from the courts or troubled youth programs, etc.?"

"A few," Jim replies, "but nearly all of them are just kids from the neighborhood."

"I see," says Ron, looking a bit perplexed.

"I guess it sounds kind of disorganized, huh?" Jim laughs.

"Weli,jt f s pretty loose and freer Sowing, fc*t that's how tft^^ 
of objectives says that your prbgrarti is intended to 'prdyide la Vide Jjfcpge g>f healthful 
neighlwtiood youth . - . activities that can serve as alternatives to Wug aridfajcbhol atr^~ 
is that right?" L ^ ' ^ 

"That's it." 




"OK," Ron pauses, seemingly pondering the situation and what evaluation designs might be used with
the BAY program. After a long silence he continues:

"Clearly, we can't employ any rigorous experimental design here. You can't deny your services--the
activities--to anyone or place a kid arbitrarily in one activity or another, so any notions of randomization
[...] time-consuming
and would probably result in a very nonequivalent comparison group. I think the best we can hope for here is
to implement a process-oriented evaluation, perhaps combined with a longitudinal outcome evaluation."

"A what . . . ?" Jim looks puzzled.

"What I mean is that first we should concentrate on getting some information on the
numbers and the characteristics of the kids who are in the BAY program. That kind of documentation is
often meaningful [...] and can help you to determine whether you're serving the kinds of
kids--ages and ethnic mix--that you want to."

"And how do we do that?" Jim asks.

"Do you have a membership list?"

"Yes, but it's not really very accurate right now. I suppose we could update it."

"That would be helpful. Also, can we get some basic background information on the kids for your
membership files--age, ethnicity, reason for coming to the center, etc.?"

"Probably." Jim looks toward Donna. "Do you think Carlos could get that information for us?"

"Yes, I think so," she answers, "although it will take at least several weeks."



"That's fine. Now, is there any sign-in procedure when the kids come into the center?" Ron asks.

"Yes. But I'm not sure how well it's followed. I could check that out, too."



"Good. An accurate membership list with some background information will tell us--and others--who's
in the BAY program, and an accurate sign-in procedure will show how frequently they use the facilities and
for what purpose."
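
The kind of tabulation a sign-in sheet makes possible can be sketched in Python. The member IDs, purposes, and record layout below are hypothetical; the case study does not say what the center's sign-in sheet actually contains.

```python
from collections import Counter

# Hypothetical sign-in sheet: (member ID, purpose of visit), one entry per visit.
sign_ins = [
    ("M01", "basketball"),
    ("M02", "game room"),
    ("M01", "boxing"),
    ("M03", "dance"),
    ("M01", "basketball"),
    ("M02", "dance"),
]

# How frequently each member uses the facilities.
visits_per_member = Counter(member for member, _ in sign_ins)

# What the facilities are being used for.
visits_per_purpose = Counter(purpose for _, purpose in sign_ins)

print(visits_per_member.most_common())
print(visits_per_purpose.most_common())
```

Joined to the membership list by member ID, the same counts could be broken down by age or ethnicity, which is the process documentation Ron is asking for.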

,.f "I like that," Dbriria^ approves". _*It!s sbmethiog ihSi I've been wariting to do fot some tirqe anyway. %ut" 
;* What about Outcome evaluation, Rbri? Are there any 'possibilities here?" i 



; • "Yes, r there are< . : 'pbssibilities,' but they're iimited^ as I indicated i>efor6. I suggest that we use a 
"?©I*^cli > __*TOj[ecM|Tg_.a sm-all» -fairly represeritativ^sample of kids as they enter the program arid 
fallowing them over an extended period of time." i * ^ * ; ■ - ^ 

"6h x . that's what ybU meant b^_a Tbri^itUdihal outcome evaluatibri,' " says Jim. ^Hbw long wguid it be?" 



"At least several ninths. Perlfaps as long-as 3 to 4 years, if that is possible." 

""Four years? You gotta be kiddjng! We might not even be here then," Jim explodes. . 

"That's true. But you have to remember that prevention programs may^take that long to demonstrate 
that they actually help prevent future substance abuse. ,You have^ to _dee|de the _tradeof ^ between how 
itripbrtarit this information could be &n<i the. cost to get it. Ybu might get ehough information to guide yot> 
in a shorter period of tilie, say 1 or 2^yiears."< r A 

"And how would we collect information from them . . . of what type, etc.?" asks Donna.

"One way to go would be to select kids aged 10 to 14, since the main goal of the program is the
prevention of alcohol and drug abuse and delinquency, and the age of onset for these forms of deviance is
roughly in that range. As they enter the program, one of your counselors could conduct a fairly extensive
interview with them," Ron says.

"Ho^ extensiv$^^ '* ' 

" ^The, interview shbuld cover cu.rc^nt and past behavior related to drug and alcohol use arid deviarice— 
for example, the past 30 days, the past year, arid initial 'experiences. It should also include some ^aggessm fen t 
of attitudes arid intentions \s well. Family environment and peer relations might also be tapped, since . these 
may act as moderator variables.", ^ ■■>■■ + f * ". ... \ 

"What are ! mbderatbr variables?!!. askf Dbriria. * ' * * ■ ■ i' : • 

' ' y • ..■»;.' * • :.y s. * 

.^Things which , may influence, or moderate, the impact of the.progiSam dfi the individual. - For example, , 
we nriight find that the- BAY experience is helfrfUl to kids from a supportive family,* environment, but not for 

"I see," Jim, hods, "but .shouldn't we also gather sqrjm?. information on their activities— how they view 
s, What they like to play, how often, and so forth?" ' . - .V. ; ' . * 



others." 



sports, 



r "Good idea, Jirri. The impact of the BA^ program arid Us activities. will probably be influenced by the 
stance the, £ids haye already taken toward these activities when they ejite^he. program. '\ . >y V 

' "Then- we w6uld conduct Ihe interviews again latere ..- ' T 



T k 



> "Yes. I Would suggest at points'^ months arid 1 year after joining the program." * 



-\ "NbW/*H)briria asks briskly, "how will we -get"- this interview devel9ped? M > - 

r fficu^t jBJskLiorjrieto assemble^ draft" int^weW- iristrUrtieriti but you'll tiave 0 to trairj your 
^cbridji^tlTc^^ |ri^strurrient. _A pilot tfjfit on three or jfour^kjds, coupled 

^tion^f the results, would give jwKa better notion of the re|ourcfes4hat will be n^Uired for* 
towfl^aiHfcfion. Can you do' that?" A v • -> • ■ . i • vfift;^" 

"Whaydo you thin^ ?im?" asks Dbnnftr ' ' ' ' ' ' ^( 

Atfe can handle that. ITsTfieTfiictual interviewing I'm w^rrfe^ about, fipw mqny k'jds are we talking . 



about^^e? ri 



"A small grohp,^ Ron regies. "Probably rib hlbre than 30 kids -orar a 4*to 6-niopth period— assuming 



yoji get that many c& thyright,age gVoup entering the program oyfer that period. 



10 to 14*a(ge group* entering the BAY . 



"Mo probjfjjp. We probably have at least twice that in 
progi'arii over a b-nVontt^ period. And if those are the numbers th^t^v^'re tklkj/ig about— 30 or so— my staff 
can handle it." 

"Are we going to need £he computer to analyze tfreseTdata too, feon?" asks Donna 

; v " ^ - \ _ x ' : . . « * ,* Va , . • 

, A ^o, I don T t think Sft/ Donna. Our Sample size will \e quite small, and the analyses will be ^mainly 
descriptive and t^ualitato^e* not the kind of complex analyses ybaUl be doing with»the PDY date. Still, just • 
the mantiaTtabulation oftdata Srid qualitative analysis will- require time from y^our staff— perhaps as .miich as 
savcrnl^eeks*af time." V \ ' ** * * » ' 

"Hmm," Donna looks concerned. "This evaluation work sure can devour resources. What if we can't
spare several weeks of staff time?"

"Well, you've got a couple of options as I see it. One: you can drop the outcome evaluation for the
BAY program and just concentrate on the process evaluation. Two: you can cut back on the length of the
interview and on the amount of analysis. But you can't reduce it too much or you'll have very little of value.
Remember, your 'return on your evaluation dollar,' as it were, is fairly meager with this type of outcome
evaluation--in contrast to the PDY outcome evaluation," Ron points out.



"Would it help to cut down the number of interview sessions?" asks Jim.

"Somewhat, but only with respect to the total person-hours over the entire course of the evaluation.
For any given period, you would still have to devote the time to interviews, analysis, and writing."



(The group then launches into a discussion of specific roles and responsibilities for the BAY program
evaluation. Ron's visit is coming to an end, so they conclude with a summary of the overall design and how
it will be carried out over the next several months. Within 1 month, Donna will send Ron an outline of the
plans they have formulated for both evaluations. Besides helping to prepare the instruments, Ron will also
be available to review the pilot test data and to provide assistance with the analysis.

Several months pass. The evaluations have been implemented, and Ron has returned to the Brightside
Youth Center to discuss the evaluation results to date, interpretation of the findings, and utilization of the
results. We look in on the group as Ron strides into Donna's office to meet with Donna, Joanne, and Jim.)



"So — I hear you folks have been conducting an evaluation?" Ron grins mischievously. 



"More or less, Ron*" Donna smiles, too. "We certainly have put a lot of worl^into it! Maybe you can 
tell us whether it's been^worth it-" ' , - > 



"You mean it's not evident by now?" 



"Well, actually, it did make us more aware of our strengths and problems," Donna admits, "but we do
need a little help in deciphering these results. You did get the drafts describing the results of the analysis?"

"Yes, I did. Shall we start by looking at the PDY results?"

"Fine," agrees Donna.



"Well; the results refject an interesting mix of outcomes. You show some impact— significant 
differences between EDY kids and the control kids-^n self^i^ one of 

the stress i management subscales', and one of the interpersonal skills subscales. But no.«effects on family < 
dynamics or on self-report of substance use. v ■ . ■ ". > 




"I brought along a couple of illustrations of the data in order to explain a 'significant difference.' First,
if you look at the top of figure 1, you'll see a portion of an Analysis of Covariance Summary Table. This was
extracted directly from the computer output and shows the results of the 'F-test' for significance between
groups--that is, between PDY and control students--for self-concept. This tells us that if we repeated this
study 100 times, in only 5 cases would the difference between the 2 groups' scores be this large if there was
no real difference. The bottom part of figure 1, which I sketched out for you, illustrates this difference
graphically. Both groups have essentially the same self-concept as measured by the pretest, but the PDY
group has improved considerably at the posttest. This difference--which looks substantial even to the naked
eye--is what was found to be significant in the data analysis."

A. Portion of Analysis of Covariance summary table for self-concept

Source of      Sum of      Degrees of     Mean                 Probability
Variation      Squares     Freedom        Square      F        of F

Group              74          1           74         3.7        .05
Error           12524        620           20.2

"Analyses of these outcome measures by school and ethnicity," continues Ron, "show no significant
differences or interactions--"

"What do you mean by that, Ron?" asks Joanne.

"The school and ethnicity analysis?"

"Yes."

"It means that the effects of the PDY program are the same for each school and ethnic group.
However, there are some interesting differences by sex."



"How so?" asks Donna. 



"For some reason, tbfcPpbY program has a greater impact on th? interpersonal skills of the boys than of 
the girls." zS ^ . ■ « . 



"I thirik the boys appear to learn more the l girts," ^explains L Joanne, ^ecause of 
the sessions whei;e we focus on ways of relating and communicating. We emphasize to the boys that it's not 
effeminate to be- social and, express your feelings. I think most of the girls already had fafrly well-developed 
interpersonal skills before they jbiried-PDY:" : ; • - v 



"Certainly a plausible interpretation," Ron says. _ "In fact, that's what the data show. 'If you look at 
figure 2, which I.al$o sketched out, ybu can see how Jbaririe ! s_expla^ the first graph 

indicate^ the interpersonal skills of the PDY group are much higher than those of the control group at the 
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"ft 



posttfest. • SBtJthe. ^i^era^onVbejtWein sfflfc and is ijli^tcate^in the bottom two^grajphs which show the 
9W^s-J^>|>oy£jand girls ir^Jbotjvgroujgs.y ; # t He gfrK imfeoth grcfiips scored higher than the boys in the pretests 
^VJfc*?W PJ?.l^£oup,- tlfej&o^ 'cwfe^f 4jp f j_-tp the g irls at, ftje ]8>sttest. So thjrt ! | the^ 

' ' ~ ' "* M • " w ?rence between the groups at the posfrest is due to «■ 



difference between thQ twojpoujte. The significant 

the improvement of the boys in the PDY classes. WoW, why fife! effects on family dynamics or substance use? 



4 These are pretty central criteria for your program." 
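
The interaction between sex and group that Ron points to can be made concrete with a small Python sketch. The mean scores below are invented to mirror the pattern he describes (girls ahead at the pretest, PDY boys catching up at the posttest); they are not the Brightside data, and the function names are assumptions for the example.

```python
# Hypothetical mean interpersonal-skills scores: (pretest, posttest).
means = {
    ("PDY", "boys"): (20, 30),
    ("PDY", "girls"): (28, 30),
    ("control", "boys"): (20, 21),
    ("control", "girls"): (28, 29),
}

def gain(group, sex):
    """Change from pretest to posttest for one group-by-sex cell."""
    pre, post = means[(group, sex)]
    return post - pre

def program_effect(sex):
    """Gain of PDY students beyond the gain of control students of the same sex."""
    return gain("PDY", sex) - gain("control", sex)

# An interaction means the program effect is not the same for boys and girls.
interaction = program_effect("boys") - program_effect("girls")
print(program_effect("boys"), program_effect("girls"), interaction)
```

With these made-up means the program effect is 9 points for boys and 1 point for girls, so nearly all of the overall PDY advantage comes from the boys.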

:« _ "Will, I don't think we should J^ave expected tp jfif^ says .Joanne. 

* "It's too powerful a force for u^ * < 



"I- would agree, and, as we discussed before, I don't think you should be disapjSftnted by the lack of 
impact on actual substance use in this short, a time : period. To really 'asses? the ef fects of the program bh 
substance use, you should follow these kids for another year or 2 ^hen "they are in the high-risk age range— 

15 to 18." ' - . J - . 



"Oh boy, more work doyvn the road." Joanne castas bemused look at Donna, 



"Just tryihg^o keep yoii busy* Joanne*" RpS l^jjgMs. "I was happy to see that you could use a standard 
pattag^ike SPSS'for all of the analyses^ By the way, whp did the computer analysis for you?" 

CleinfelSx at the ui^versity, f VJ?bhrta "aniwers. ""He* was supSr. I dbri f t know-how we could have done 
it wWPPut him." \ • " 



"How^havf yoi^paid for allrthis? 



0" 



"A combination of gr€ 
through Hal's go*>d auspice 



Student volunteers and a small grant frotri the^niversity Computer Center, 
^Donna-replies. ^ 0 



- "Well, you've got some, results that should be of interest to a number of people, but let's get to that 
later. How are you planning utilize J • * & . ^ 

"We'v£ already usefd^them to alter the * PDY sessions for the coming year," answers Joanne. "We're 
disking out the family dynamics sessions, and expanding the stress management component to try to show 
more impact in that area. Also* bur_prbcess observations show that Neither the stress management nor 'the 
interpersonal skills sessions, ar*e- impfifejented in the way^jwe intended." 



4 



"How so?" asks Ron. 



"Well, both components are supposed? to be bum around behavioral exercises. For Example* the ^tress 
management sessions were to include the actual practice c# tel^atibj] i techniques by the students, _and_tJl^ 
interpersonal skills sessions were_to be ^ based on ifeveral rble^laying^xerc^es. In fact^ wcfounjj that most 
of the sessions were of the lecture-«JiscUssion variety— the students of ten* appeared bored and distracted. ^1 
think that's one of the reasons We didn't have, as muchi/npact bh outcome measures as we f d hoped." y . 



"That^ interesting. Your findings weren't j^ally measuring the^ it w§s i 

Instead, you were measuring* as^always, .what/ ^actually happened. But in this case, $6u fbuj 
difference L between program design and implementation. Th§t alone is ; sufficient argument Jo justify look 
carefully at the process." • \? . 



to be. 
major 



Then, turning toward Jim, Ron comments, "Well, I guess B[illegible]T's outcome evaluation hit some resource
problems, eh, Jim?"

"Yes, I guess you could say that. After we pilot tested your interview instrument [illegible]
for the most part, by the way, it became clear that, at least at this point, our staff just didn't have the time
to devote to the interviews, at least to do them with the depth and breadth we thought was required."

"Threw in the towel, huh?" Ron laughs.

"Not really! We're not quite ready to abandon the outcome evaluation. If we get the grant
we've applied for, we'll add a counselor, then we should have the time to do the outcome evaluation this
coming year. So we're hanging in there!"

- "Well, I think you made the wise decision; It's inter es^Tg^ttfttoA^e same thing^ happened in 
C§§e study, but prb^Bly just to <?lit down space. Our m<^ve-i§T3^^ We^^fburid but* throyg^ pilot 

: S "-75 84 • ' • . • - " 




t:e9tingt that We just d^t%ave the Resources right now. W^rp you able to put sbrrie ti 
Evaluation? 11 . W r * > . ■;• ' 




"Yes. That's going pretty well," Jim replies. "We've been able to update the membership information and
document the daily flow of kids into the different activities."



"Has that in farm at iofi b^en Tielpftil tb^oii utiny way?tjlave ycrtf made any changes in your program?^ 



4k 



["^e^iip lists on a map and found pu^that we haven't 
t near\here-iso we've started. a recruiSlng driye in that 



" Yes^k , !has," Jim jaff ifrtis.^ "We plotted ^ 

been attracting kids fr^ ^ , 

area. We've picked' up : soBne kids frbrri_ there in the past couple of months. AlsoV an analysis of. bur daily 
sign-in L sheets showed a hug^e l number of kids who were drqppm ffienlbers of 

any organized activities.^ That's OK, of course, but we're trying nqw~tb persuade somerSf them to get inore , < 
involved. W^Ive plotted' that oh a map, too* tjgbelp lis focus bur energies*" - ^ - 

"[illegible] of the map. Graphical displays aren't used frequently enough. They
also help you to get into the data and explore its meaning, often producing information you weren't looking
for when you started out.



"Was there any^resistance to the data collection cm? 




e staff?" 



Irt;&-the 

. . _, : :jfau ffi f :: *' V : -• - 

' ^Both,"^Jim answers Ron with a laugh, V but only in thejfceginning. After the first cpuple^of weeks or so, 
they all settlecHnto the routine pretty weft?' ^ ' 



"Well, I'm glad to hear it. Good luck bh the grant." 

"Thanks. I'm sure ive'll rieed.it!" * 

D • 




"Now," jjoji asks Donna, "Hdw are you planning to utilize this informatian^siteiTOtl^y?" : ' m y ' 

"Wfcfbi as ypU knoWj we're j>uttihg^tqgether §LCbmf)r^fnsiv^ b^h .evaluatipnsi This will be 

sent to'theState and the United Way— our majfcr funding ^fyenc'ies*" *% . ^- 



Lwe'vS planned." 




"[illegible] things. First, I think you should develop a condensed executive summary of your
findings, suitable for sending to the schools and others who may be interested in your findings, such as the
mayor's office, but who don't want to pore through the [illegible] report. Second, I think you should send
these findings to other [illegible] funders. Don't forget that each
organization may be interested in a different aspect of the evaluation. Schools, for example, might [illegible]
hear more about the [illegible] your impact so far [illegible]
arrange to make some personal presentations [illegible] have more impact."

, "Those arVaJl good ideas, Ron^^Noj^Hrwe can find the tifne . - ." Jonna adds ru^fl^ly^ 

'£ .uhdersWnd. Ix>bk^,,ya[i fblkjr have\dbrie a great 
evaliiation Is no eas$ task. M - i ( ^ -* iv 




seful t ! 




f V Db^H^Bys em^aticalr^^s J^^irie arid Jim nod in, agree ?o 



evaluat 



than h 



hi it would be." 



Job— you're tb_be ebnfffAt6lated.^J)oirig prograri 

< f But»^!s already be 



"jLtls^trarige^ ^ufll that's the Epical response ' I s^ar frbril. pjogf am^ aft' 
jation. towiarbly it has m^re U^lTty than the^ thought it would."/ * 

> ' - " i- i- v.' ' ■ "" W^" ^ 

"Thank ybu for your assistance, Rbri^" says Dbnn^j^yb^^eally been a Jielp tb t us." 



5y T ve co£ripI?te<r Sri 




••VHGlad to 



y-it." 
FOUR: THREE DISCUSSIONS

Planning an Evaluation of a [illegible] Training Prevention Program

Characters:



o Pamela Raven. A program developer in the curriculum department of the Cinnamon Bend Unified
School District.

o Lacey Strait. Cinnamon Bend Sch[illegible].

o Conrad Sizer. A local evaluator [illegible] District by NPERN in response to a
request for assistance.

o Allen Compass. A second evaluation consultant referred by NPERN, and a colleague of Sizer.

First Discussion. A bright, crisp day in late fall. Conrad Sizer and Allen Compass have just entered the
office of Lacey Strait. Strait and Pamela Raven are seated around a circular table. They rise and shake the
evaluators' hands, [illegible] by Compass, [illegible] trade a few comments about
politics and life in a bureaucracy, [illegible] pencils and pads. A tense silence threatens to settle. Some
throats clear. Then, as if it were expected, the discussion begins:



Strait: Well, I asked you folks to come to this meeting, so I guess I'll start it off.



Compass; Sounds reasonable. 



Strait: As I think I told you on the phone, Dr. Sizer, Pam Raven here has developed [illegible] launched
what [illegible] magnificent little program. It's intended to be a sort of indirect way of preventing drug
[illegible] by adolescents, but it seems to have a lot of other things going for it, too. It's been operating for
a year now, run by several teachers in two schools. The program is really great. We call
it the Cooperate and Progress Project, or CAPP, by the way. In fact, just about everybody loves it.
[illegible] love it, the kids love it, and parents love it.



Sizer: Wait a minute. Do you mean to say that there's nobody who doesn't love it? If that's the case, this
must really be a first in education!



Strait: Oh, of course there were a few parents who didn't want their kids to b[illegible], [illegible]
groups who have [illegible] community newspapers and the like, but
compared to most of the new programs we've tried, there hasn't really been much criticism. In spite of
the fact that everyone likes the program, our school district is in a funding crunch. I went to
[illegible] Education meeting last month with Pam, expecting to request more money for expansion, and they told
us out of the blue that all special teacher training funds would be cut next fiscal year. [illegible]
they announced their intent to go back [illegible]. Pam and I had to do a quick [illegible]
there just [illegible] consider maintaining it.



Hmmm. 



Raven: Yes, we got a reprieve [illegible] consider continuation only [illegible] we can show
them that the program works [illegible] immediately, Dr. Strait [illegible].



Sizer: It looks like we know who we are evaluating for. Now, the question is, what do they mean by
"works?"



Raven: They are concerned about showing that it prevents substance abuse and other deviant behaviors, but
they made it very clear that if we can't demonstrate that the program teaches the basics at least as
well as traditional methods, it's out.

Strait: I got angry myself, since we know the program [illegible].

Sizer: But how can you be so sure the program is working if you haven't evaluated [illegible]?

Raven: All you have to do is look at the classes, look at how the kids are getting [illegible]
their faces, talk to them a little.

Sizer: Have you done the same things with kids and classes that aren't in the program? [illegible]
those classes [illegible]?

Raven: Well, not as much, I suppose, [illegible] I still know.

Strait: This is turning into a debate about the [illegible] interesting but
is not what we're here for today. If we want to continue, we've got to evaluate the program, so we may
as well start with the assumption that that's what we're going to do.

Sizer: That's a perfectly good reason to do an evaluation. In fact, most evaluations have survival as at
least a partial motive. I think there's a positive value [illegible] evaluation,
sometimes in conjunction with a simultaneous collection of subjective impressions of sensitive
observers, participants, and so on.

Compass: Well, that's part [illegible] process evaluation that should naturally accompany the outcome
[illegible].

Sizer: Yes, it is, but that's closer to [illegible]
to draw a distinction.

(A brief silence ensues. Those with coffee sip.)

Compass: You know, I just realized something. I don't really know what we're talking about! We're
supposed to be discussing the evaluation of a program, but the only thing I know about it is what we
discussed on the phone. Do you think we could hear a description of it?

Strait: Yes, that's how I intended to start, but we seem to have gotten sidetracked. Pamela, since you
developed the program and know the most about it, why don't you give a brief description of it?

Raven: I'd be glad to. (Looks at visitors.) Please interrupt me whenever you have a question. Well, the
project got started out of dissatisfaction with some of the other approaches to drug prevention with
adolescents and pre-adolescents. So many programs have tried to approach the problem [illegible]
horror stories, rewards (the kids often see them as bribes), or large doses of information. It seemed to
me, from watching some of the programs in operation and from talking to some of the kids, that these
direct approaches made the kids resistant, suspicious, and negative. They saw it as propaganda being
forced on them by [illegible]. [illegible]
thinking about an indirect approach. It seemed to me that instead of focusing on drug use per se, or
even on attitudes specifically about drug use, it might be better to focus on some of the psychological
factors which seem to predispose kids toward using drugs, if my reading of the research [illegible]
correct: things like low self-esteem, low feelings of personal control over the environment, low self-
control, and the like. The idea, then, was to develop a school program which would have meaningful
effects on these kinds of things fairly directly, and would then influence drug use and drug attitudes
only through its [illegible].

" Sizer^^ lik^^our thinking, >but that doesn't s_e_ejm_ like a __v«^ .easy fasR ypU set for yourself. The^e " " 
jhplbgigstj- factors^ jfe you call them, sound^ like ttm|s JjWrt are fairly (teepj^ it^^ineS in. the 
perfpnality. f would th^k they might be even fiaider to chl^^^r- influence \hah drug^ise! jj^ 

Raven: Well, first, thank you for the compliment. As for your second comment, I thought that [illegible]
at first, when I saw which psychological factors had been found to be related to drug use. But
Lacey showed me some descriptions of "cooperative learning groups." They've been used in
classroom [illegible] and classrooms with handicapped or "mainstreamed" children, and
have shown effects on some of the very same variables that have been found related to drug use in
adolescents. So it seemed like it might be worth [illegible] to see if
it did have some effects on drug use and drug attitudes. So I worked up a program and got [illegible]
teachers to try it in two schools, as Lacey said.

Compass: What do these cooperative groups do? How do they differ from regular classrooms?

Raven: Well, [illegible] use the same curriculum as the [illegible]
by themselves and maybe competing with others for grades, praise, etc., we try to set it up so
that they benefit from each other's learning. We use a method called "Jigsaw," developed by Elliot
Aronson. It's called Jigsaw because each unit of curriculum is divided into [illegible]
together by the kids in a group. Say the class is covering a unit on the Civil War. [illegible]
into six-person groups, and you divide the Civil War readings into six sections. Each member of each
group is responsible for learning one of the sections and then teaching that section to all the other
members of that group. Before teaching the section to the other group members, the kids from each of
the groups who have the same sections to teach get together and help each other learn that material
and decide on the best ways to teach it to the other group members. Each member of the group
becomes responsible for the learning of all of the other [illegible]
whole on that unit of the curriculum. So no one individual can benefit unless all the group members
learn the material well. Aside from learning the curriculum very well (the approach has been shown to have
positive effects on academic achievement), kids in these classes learn to pay attention to the needs of
other kids, to adjust their teaching so that each of the others [illegible]
concerned about other people, and they learn that they can really make a significant contribution to the
welfare of everyone in the group. This helps them to feel better about themselves. If you watch a class
that's going well, you can see this happening!
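The grouping Raven describes can be sketched in Python. This is a minimal illustration under stated assumptions, not the program's actual procedure: the function name `jigsaw_plan`, the class of 30 students, and the rule that the n-th member of each home group takes section n are all hypothetical choices made for the example.

```python
def jigsaw_plan(students, n_sections=6):
    """Split students into home groups and same-section "expert" groups.

    Assumption for illustration: the n-th member of every home group is
    responsible for section n of the readings.
    """
    if len(students) % n_sections != 0:
        raise ValueError("class size must be a multiple of the number of sections")
    # Home groups: consecutive blocks of n_sections students.
    home_groups = [students[i:i + n_sections]
                   for i in range(0, len(students), n_sections)]
    # Expert groups: everyone across home groups who holds the same section.
    expert_groups = {section: [] for section in range(1, n_sections + 1)}
    for group in home_groups:
        for section, student in enumerate(group, start=1):
            expert_groups[section].append(student)
    return home_groups, expert_groups

students = [f"student{i:02d}" for i in range(1, 31)]  # hypothetical class of 30
home_groups, expert_groups = jigsaw_plan(students)
print(len(home_groups), "home groups;", len(expert_groups), "expert groups")
```

Each expert group meets first to master its section, then its members return to their home groups to teach it, so every home group ends up with one "teacher" per section.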

Sizer: Well, we can't use grades as a measure, since the kids are graded by their own teachers. Do you use
standardized readiness or achievement tests?



Raven: Sure. Every class level has a broad achievement test at the beginning and end of each school year.



Sizer: Where "dp they keep th<$jb records? 



tfrlfo 



iung 



Strait: Oh* dri'that Fancy computer! Do you knb^thafjyhile theyre frying tb^t fcack teacher 
they're planning tcTbUy an even fnore expensive one— aft et only 3 years- - 9 

Sizer: What other student reeords do th^rlceep on. it?, ' • 

___ ___ _(__- _ __ # . Jl_L.^^J1'Lj:U -iJV % !-"j?S^--- - — - -- ~~ ' 

Strait: Everything. They keep track of absences, tardiness, [illegible].

Sizer: Great. That will cut down on data collection costs. [illegible]
pencil tests are my stock in trade, but behavioral measures are usually the best, provided they are
directly related to the objectives. The standardized tests should be good measures of academic change.
Absences have been shown to be associated with substance abuse and, in fact, a host of delinquent
behaviors. Disciplinary actions speak for themselves.

Strait: We've got to be concerned with cost, because the board won't give us any extra money for the
evaluation. That's one of the reasons we called [illegible].



C<"pfpasjsi We'l^keep that paramount when we xievelS^BS 
tb see. bne«6f these classes operating. Are arty, oEthf 

.■ ' V r 

\ . * • . • * 

Sizer: Pd jike to see one* too. ..- . * 



e^luatibn plan. 

Sflsfqn now?* *, 

■ ■■ 



Raven^ Yes, th^e are several, ami you'd be' cnqst welcome t^cbme and Visit- 



Meanwhile, l f d rea^ ,* 



Strait: Before we set up any specific visits, [illegible] your comments on whether or not the program can be
evaluated.




Sizer: It seems to me, from what I've heard so far, that a feasible evaluation design could be developed.
You seem to have a fairly clear idea of the major variables you are trying to influence, both directly
and indirectly, and at least a rudimentary theoretical model that lays out some of the mechanisms of
influence. The process in the classrooms sounds fairly well specified and observable, and reliable
measures of some of the psychological factors already exist. I'd like to see some of the classrooms in
session before making a final decision, but as of right now I'd say that a decent evaluation can probably
be developed. What do you think, Allen?

Compass: I feel certain that a [illegible] evaluation plan can be developed, and I'm ready to start on it right now.
But first, Lacey, [illegible]. There have been other studies that show positive effects on academic
achievement. You [illegible] to persuade the board to use those findings as justification for
continuing the program next year, [illegible] give us a chance [illegible] it. If you can do that, we can start
some of the preliminaries now.



Strait: Well, let me make a [illegible], then. You and Pam can set up some visits to classrooms in the next
few days. The[illegible] it for a while. Read some of this material we've put together on the
[illegible] Project ([illegible]), discuss it with each other, talk to Pam [illegible]
need any [illegible] information about the p[illegible] it has operated in the [illegible]
next year, and then give [illegible] a call and let me know if you [illegible] in
[illegible] on it with us. Meanwhile, I'll go [illegible] the board.




Sizer: Right.

Compass: Sounds good.

Second Discussion. A [illegible], cold day in early winter. The same four people are sitting around an oblong
table in a meeting room in [illegible] at the university. Tables around the sides of the room are
piled neatly with [illegible].


Strait: As you know, the board approved our continuation based on our summary of the literature on
academic achievement, but we still have to show that it works here. So let's look at the evaluation
draft. You folks really did a nice (and, what's even better, a quick) job developing the draft of the
evaluation plan. I want to say I'm really glad you decided to take this on. I think we're going to work
well together. Pamela and I do have some questions about a number of points in your plan, so maybe we
can just go through them. My first question is this: why on earth do you have those observers in there?
It's going to disrupt the classrooms!



Raven: Lacey, please! Calm down.

Sizer: Well, there are a lot of things we need to look at. We need to look at the psychological change and
academic achieve[illegible] the program, and we need to
look at the more directly prevention-related variables. And, finally, we have to see what's actually
going on in the classrooms.



Strait: Yes, but the number of hours of observation you're calling for is going to wreck the program. The
teachers won't stand for it.



Raven: I'm afraid I agree with that. Remember, we have two major interests. We want to get some ideas
about the psychological processes being affected, but we also have to satisfy the board just to keep the
project going.



Sizer: But don't you want to know, in some really well-documented sense, whether [illegible] having the
e[illegible] you think it has? And if it is effective, aren't you [illegible] school districts?

Raven: Well, of course, but...

Sizer: [illegible], and certainly the most responsible way to get the project first [illegible], and later
[illegible] by other districts, is to have its effects clearly and rigorously documented.

[illegible]: I don't think the history of educational fads bears out what you say, except [illegible]. I
mean, a lot of things have been taken on without any real evidence at all.

Sizer: [illegible], of course, but surely we don't want this thing to become a fad. [illegible] shown to be
effective, and the reasons for its effects seem to be really well understood, then [illegible] adopted
[illegible] those conditions. Short of [illegible] attractive and intriguing it may sound, [illegible]
shouldn't be adopted, at least very widely, no matter ho[illegible]. [illegible] to say this, but
[illegible] "responsible" [illegible].

Compass: Them's tough words, pardner.

Raven: Actually, I think I agree with you. We don't want this to become a fad, in one year and out the
next. We [illegible] to be solid, and if it takes tight research to make it solid, [illegible]. But the amount
of observation still bothers me.



Strait: 



seti 



Well 
^Ajs to 1 
j se 
K acfyall 
'a^ _ 
_ Sizer: In 
* and— we'll 

we have tm assi 
stage, bu^e'il 

Compass: , C9rirad 1 



that leads m^<rari6^er^ilestiori. 
et the Jig^^prp^a5l* 'Ttfeht now, 



e of, 



teachers involve^ ho&ever, Q< 
ariS^riSJi^tw^studerit A 



rst^ I'm a ttttle cbrtfused as tb liq^iye 4 
5^_4B Vl^S^^^ littlfej ov^ji d^?en s 




1 
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would rpmdojffiy &s^ign 
trbl^ Jri jm^jDi tbf-twb%r» 
Ire crf^rooms to Jigsaw 
that l^tep You als<^have t 

mJ^rt h%lo , to cfra 




pich^ie 
"wo 
are w^ to 



w the design blit so they can 



80 



Is, Jigsaw 
lds^oohi untts, so 
Jnsin 4he analys 

a bit Easier 





Raven: I hope it will help.

Sizer: OK, what we have is called a 2 x 2 x 2 x 2 factorial design, that is, with 2 schools, 2 grades (with 8
classes in each), 2 tracks, and 2 teaching methods, like this (going to the blackboard):




The row of boxes represent the actual classrooms [illegible] (C for control). There are 32 classes in all.
[illegible] look at one school. School A has eight fifth grade classes, and [illegible], and four of each
are in the top track [illegible] the low track. So we can randomly select two of the four classes
[illegible] grade in the school to receive Jigsaw. The same logic holds true for School B, so that
[illegible] balanced design for school, grade, and track.

Raven: Is this that tight research I just mentioned? Do we have [illegible] control for ev[illegible]?
There are [illegible].



Sizer: It's as tight as we can make it given the overall situation. [illegible]
methods, [illegible] combinations and 32 classes overall, but the[illegible]
before you can get more elegant.
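The balanced assignment Sizer sketches on the blackboard can be illustrated in Python. This is a sketch under stated assumptions rather than the evaluators' actual procedure: the function name `assign_classes`, the seed, and the grade and track labels are hypothetical placeholders. It assumes 2 schools x 2 grades x 2 tracks with four classes in each cell, and randomly picks two classes per cell for Jigsaw.

```python
import random
from itertools import product

def assign_classes(seed=0):
    """Randomly assign 2 of the 4 classes in each school/grade/track cell to Jigsaw."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    assignment = {}
    # Grade and track labels are placeholders, not taken from the source.
    cells = product(("A", "B"), ("grade 1", "grade 2"), ("track 1", "track 2"))
    for school, grade, track in cells:
        classes = [(school, grade, track, n) for n in range(1, 5)]
        jigsaw = rng.sample(classes, 2)  # two of the four classes in this cell
        for c in classes:
            assignment[c] = "Jigsaw" if c in jigsaw else "control"
    return assignment

assignment = assign_classes()
n_jigsaw = sum(1 for method in assignment.values() if method == "Jigsaw")
print(len(assignment), "classes,", n_jigsaw, "assigned to Jigsaw")
```

Because the random draw happens inside each cell, every combination of school, grade, and track contributes the same number of Jigsaw and control classes, which is what makes the design balanced.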

Compass: In response to your second question, to rule out other explanations for [illegible]
the classes, we have to measure and test the effects of other possibilities. For instance, there
could be differences between the two schools if the atmosphere and/or environment differ. With the
proposed design, we can also see if Jigsaw seems to work with one grade or track better than the other.
If we just lu[illegible] and selected half to get Jigsaw, we might just get certain
effects canceling each other out in the data, showing no overall effect, and have no way of breaking the
results down.
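Compass's point about effects canceling out can be made concrete with a small Python sketch. The class-mean scores below are invented purely for illustration, and the names `class_means` and `jigsaw_effect` are hypothetical; the point is only that an overall comparison can show no effect while a breakdown by grade shows two opposite ones.

```python
from statistics import mean

# Hypothetical class-mean scores, invented for illustration only.
class_means = [
    ("grade 1", "Jigsaw", 54), ("grade 1", "Jigsaw", 56),
    ("grade 1", "control", 50), ("grade 1", "control", 52),
    ("grade 2", "Jigsaw", 46), ("grade 2", "Jigsaw", 48),
    ("grade 2", "control", 50), ("grade 2", "control", 52),
]

def jigsaw_effect(rows):
    """Mean Jigsaw score minus mean control score for the given classes."""
    jigsaw = [score for _, method, score in rows if method == "Jigsaw"]
    control = [score for _, method, score in rows if method == "control"]
    return mean(jigsaw) - mean(control)

overall = jigsaw_effect(class_means)
by_grade = {
    grade: jigsaw_effect([row for row in class_means if row[0] == grade])
    for grade in ("grade 1", "grade 2")
}
print("overall effect:", overall)
print("effect by grade:", by_grade)
```

With these made-up numbers the overall difference is zero even though Jigsaw helps in one grade and hurts in the other, which is why a design that keeps grade as a factor can break the results down.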


Strait: Well, I have a clearer picture of the random assignment, but I still have a question: Isn't the purpose
of random assignment to make the experimental and control groups equivalent? Your experimental
design now calls for two testing periods during the year: a set of pretests in the fall and a set of
posttests in the spring. But if the random assignment has made the two groups equivalent at the
beginning of the experimental year, why can't the testing be limited to the spring (or post)
testing?

^ 2 ^-,* ^„w.^k have to t]ie data t® 

sttle start <g 



uiy'alent at the 
^4Htiited^o the Wrftig (or pdist) 
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hange— the, very things ybu measure: as the k 



8^ 




the stafl are' the ones ^bjj^e^ 
files/ While th|t f s < suffici^R : 



1* 



reasq*', this 
^akes*asses«i 

latively a 
irkerest is fn 
essential 
II 





That's very convi 



random assignment of classrooms, rather than individuals, [illegible]
of the study even more important. We're talking about a relatively [illegible]
16 experimental and 16 control. Finally, even though our major interest is in [illegible],
[illegible] of the analyses involve group-level aggregations. It's essential [illegible]
class assignment to teaching methods has resulted in [illegible]



3 didnt 



Raven: I have a different [illegible] kind of proprietary interest in the program.
[illegible] other people have worked on it with me, but, well, it really was my idea. Anyway, we have
[illegible] carefully constructed out of several elements which fit together just so. And I get the feeling that
this evaluation is looking at one piece, then another piece, sort of pulling the wings out and looking at
them one at a time. I just wonder how it's going to arrive at a picture of the whole thing. It's an
organic system, not just the sum of its parts. I think you're planning to look at the parts, the individual
pieces, but not at the whole.

Compass: That's a very good question, and it's one I sympathize with. I do think we have a way of getting at
the program as a whole, but it may be a bit underplayed in the proposal draft; you may not have
noticed. Actually, we have two ways of handling it, one more quantitative and one more qualitative.

The quantitative way: although we plan to measure a number of variables individually and one at a
time, our analyses won't be limited to looking at them individually, at least not all of the analyses.
Many analyses will use multivariate statistical procedures to identify patterns of variables. That is,
we'll try to recreate statistically the complexity of the program and the program's effects. But to
really appreciate the complexity, we recommend classroom observation. A number of sensitive
qualitative observers will go into the classrooms and observe the general aspects of their social
structure, atmosphere, interaction, etc. Initially, these analyses will be done independently of
the more statistical [illegible] two sets of analyses will be looked at together. We expect
that the qualitative analyses will help add flesh to the quantitative ones, help us in their
interpretations, and help identify new variables and new ways to investigate the quantitative data. The
quantitative data will similarly provide an empirical anchor for the qualitative speculations. Each, we
hope, will strengthen the other.

Raven: OK, here's another question: Your proposal stresses assessing the adequacy [illegible]
implementation, process measurement, and the like. I think I understand the p[illegible]
mean, it's nice to know that people are running programs right and all that. But, i[illegible]
think I have a pretty good idea about the program [illegible], and in the second place, [illegible]
can do about it anyway if it's not being run correctly, is [illegible]?

Sizer: There are both program reasons and evaluation reasons [illegible]
implementation. You can do something about it if [illegible]
know about what's happening. You may have [illegible] idea [illegible], but
you need much more specific information [illegible] quantitative data. [illegible]
doing a careful assessment [illegible] being done right. That is, [illegible]
how well the program is [illegible] done, but [illegible] sense of knowing the c[illegible].

Raven: I guess I had a misconception. I thought that once you s[illegible] a formal evaluation of a program,
you'd be stuck with what you get and that you wouldn't use the research results to alter the operation of
the program in the middle of the evaluation.

Sizer: What you had was only a partial misconception.

Raven: I don't know what you mean, but it makes me feel better.



Sizer: Well, if [illegible] of the program leads you to believe that the initial program plan may be
[illegible], and that one or two or three of the program elements, even though they're being implemented well,
should be altered or dropped, you should try to restrain yourself. Making such changes might be [illegible]
the program (though it would be difficult to document that it was, short of doing a second
evaluation), but would be disastrous for the evaluation. To evaluate a program, the program has to
first be [illegible]. The process observations help in the definition. But if the program changes into
something different halfway through, the evaluation cannot clearly generate information about the
program after the change, as distinguished from the program before the change. Thus, it would be
better to note your ideas for changes in the program as they occur in the course of doing the evaluation,
and then to test them later, in an evaluation of [illegible] program (which would also be informed by the
results of the evaluation of the initial program).


But if the process observations show that the program is not being implemented as
planned, it is perfectly permissible to bring this to the attention of the program implementers and to
try to get it changed so that it becomes adequately implemented. [illegible]
implemented, it is not the program which is being evaluated, but a distortion of the program.

Raven: But does that really work? Is it really possible to train implement[illegible]
similar, and equally adequate, versions of the program? After all, people vary, their skills vary, and
their temperaments vary. It seems almost impossible. And if it is impossible, what does that do to your
[illegible] little evaluation designs?

Sizer: Well, you're right, that can be a very serious problem. There are some ways of handling it. But
before I go into them, tell me, how much variation do you think there is in the way the teachers
implement the program now?

Raven: A great deal! All of the teachers are volunteers, of course, but [illegible] great differences.
A few of them seem to understand the program completely, are very interested in it, and do it very
well. Some others work really hard but don't quite seem to get the idea. And others really show a
pretty low level of involvement.

Sizer: Have you worked much with the teachers who are less good at implementing the program?

Raven: Oh yes, at least we've tried. We do most of our work with the people who want to do it and ar^ .1- 
willing to work at it, but have difficulties with it. With thos^ho are really riot interested, there's hot 
mtich we can dq& I guess what has kept us going is that*the program looks so nice with those teachers 
who do it well, i f q - 

Sizer: It's a crucial problem, and one of the major uses of the process data I just suggested is to get useful evidence quickly about where and to what extent individual teachers may be going wrong. Of course, an intensive initial training involving class tryouts and frequent feedback is also essential. It's also important for teachers to have a say in the definition of the program, that is, in helping decide the best specific ways to implement the program in the classroom. Do you involve teachers in the planning at all?

Raven: Well, we've had a few teacher representatives work with us, but they aren't much involved in planning. I can see that it might be a good idea, though.

Sizer: I think it's essential, for two reasons. In the first place, it will greatly improve the program. Teachers know the classroom, and how to make things work in it, better than anyone else. You'll find that they have a lot of useful ideas about the best ways to make the program work. Secondly, teachers who have a real say in defining the program (and it's important that it be a real say and not a token) will become committed to it and will do everything they can to make it work. Teachers who feel that something is being imposed on them, even if they "volunteered," are much more likely to be indifferent, or even hostile, to the program goals.

Raven: That makes sense. But let's get back to the process data. What kinds of data are you talking about?

Sizer: Several kinds. But before we discuss them, it's important to emphasize that all of the data will be kept confidential at the individual classroom level. And the teachers must be made aware of that. We have to make very clear that the program is on trial, not the teachers. Now, all of the process data stem from observation of one kind or another. The first is done by the trainers. By the end of the training period, there should be a pretty clear idea of how the program should look when ideally implemented, and of what will and won't work there. When the teachers go into the classrooms with the program, the trainers must make frequent visits to observe the classrooms as the teachers attempt to implement the program, and to give feedback. This will be fairly informal observation, although

W' * w^' d^toL£siB^ an4 fee^lck f^m ^$ aid L in : ^hisjr« 

• Ifc^^^pftbl^ffror^ to teacher, ^t the eyeflfaation staff more forma^ w fne 




Raven: All that may help, but after all the training, all the visits, and all the feedback, there are still going to be differences in the way different teachers do the program.

Sizer: I'm sure there will, too, but I hope that after all the training, visits, and so on, at least the differences will be in a fairly narrow range. Don't forget, the process data will have an important research function, as well as the program quality control function. In the first place, it will allow us to document fairly rigorously exactly what the program was, as delivered. But beyond that, if differences do occur, we will be able to see what effects these natural variations in program implementation have on the measured program outcomes. This might give some interesting evidence about whether some elements (particular teacher skills, for example) are more important than others in producing those effects. We could follow this up with more controlled studies later.



Strait: I'd like to hear more about how you're going to deal with teachers who think you're evaluating them, or that someone might use the data for that purpose. I mean, you're going to write reports, publish, and analyze data (if I know you guys). How will teachers know that the data won't be identifiable? Can you make it anonymous when you collect it?

Sizer: No, we can't make it anonymous. For one thing, we want the trainers to have access to it, at least, to help them improve delivery where it is needed. Besides that, for purposes of the data analysis, we'll need to identify all the different kinds of data that come from the same class. But the identity of the teachers won't be given away by any of the reports. Results will be reported on the group as a whole, not in terms of the performance of individual teachers. Still, at some level, the teachers will have to trust us. We'll have their data, and we'll tell them that it won't be used for evaluation and it won't be revealed. I hope that we'll be able to establish good enough relationships with them so that they will believe us and feel secure and safe with us.

Raven: I think most of the teachers will accept that if there is really good rapport. But I'd like to get back to the process data and analyses for a minute. You've mentioned two uses for it. Are there any others? What benefits will there be to the program?

Sizer: They should give some objective evidence about which parts of the program to strengthen, change, or possibly eliminate. The data will be useful, in other words, for making revisions and improvements in the program.

Strait: Do you really think your analyses will be done quickly enough so that we'll be able to use the results to justify continuing the program? I've had experience with program evaluations before, and getting the results out of those folks takes forever!

Sizer: It's hard to make guarantees that involve things beyond our control (like computer crashes), but we'll certainly try. We usually try to phase our work so that we get overall results quite quickly (starting with basically descriptive data) and move later to the more fine-grained and detailed analyses.

Strait: When I see it, I'll believe it.



Sp egkfhg of the 



we ought to talk abouti-ifl^bW going to be pro<fi 
^ow is that information gOTng tq^be^ised? 



Siz 



iz^^^y.AQ 
Strait; lfey 





Education 
made at K a 
achnical ^p^ort. gh ^rt rtuimnd 
>1S ^fi^Mbly^the^ audiervc 
iteresFj^inie prefect and a gj 
t^ould be enoou raged, if sojne wa 
distdrt^in translatioA. ' 




SO 



ill be submitted to ys(u_fqr your us^ 
at in addition to a ._wr(lte^repdrt| 
^ If you thi i 3titis necessary, I'm 
J6L 
If 
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ly' descriptive Jlata^ arid rrtove'Sfeter tp Jhe.rpore 

Vari sff^^^tions'afcut 'that tha 
of, informajfiott about this program 



/ , 

the Board oF* 
atib^^be 
ith ybu jjjpr^ 
participating ^ :«r . 
a great 



pare a report fol 
•prepare a brief .pr^e 1 
s necessary, I'm _suiV*bhi# of tigy could go 
rb^ct^find the findinra wi.fi be sent tr^" 
"results come out as hypothesized, there' 





Siilci r &S by btfler Schqbf'flistricts. 
e certain that the program doesn't 





Strait: Now wait a minute. You're talking as though you're going to be in complete control of what's released and to whom. Don't forget this is our program. You're just being called in to do the evaluation. So I think we should have final say in all matters concerning interpretation and dissemination.

■ " -_ * "\ I ' 

Sizer: I can't agree with that. My assumption has been that we would determine the content of all reports that describe the evaluation, and that you would determine the content of reports that present or describe the program. Reports that do both we could work on cooperatively (and, of course, any evaluation report will need to have at least a brief description of the program). Or, we could get your approval for the portion of an evaluation report which describes the program's operation and goals. But the description of evaluation procedures, outcomes, and implications is our responsibility and must be under our control.



Raven: That sounds reasonable to me. I don't think it looks good when a program appears to be evaluating itself. Results are more convincing and credible (especially positive results) when the evaluation is clearly seen to have been done and reported by some independent group.

Strait: You've got a point there. Not a good one, but a point. If those are the only conditions under which you'll take on this job, I guess we'll have to go along with it; but frankly, it makes me a little nervous.




Sizer and Compass: Why?

Strait: If the results are clear and straightforward, there's no problem. It's when the results are negative, or a little muddy, or open to interpretation . . .

Sizer: Surely you don't want us to minimize or distort negative findings?



Strait: Oh, heavens no! But there are different ways of looking at things. You don't know the ins and outs, the political machinations, the specific things that are likely to set off one or another community group. At least, I would want to have the chance (and maybe this should be formalized) to review any reports you prepare and to make suggestions about wordings, emphases, and the like.

Sizer: I can agree to that, and even welcome it (since you do have such extensive knowledge of your school district and your community), as long as it is understood that any comments or suggestions are advisory and not mandatory. We would certainly consider any of your suggestions very carefully and seriously, and would probably accept most of them, but I don't want to be bound to that in advance. There's an additional mechanism you could use. If you had any disagreements, you could include your own statement as an addendum to any of the reports.

.StrditC TfSt^doesn't com 

, A ■ . ... * 
Sizerf Whit^bbut it? . 



ely g^tisfyine, but I 




Strait: Well, since you're g*^0^ prejgyri 
deciding who they go 1 3) anH^s^idirig therj^i 

■Sizen\ We should do* that tMether. I 

yraht tb^bave, di rec^e<Lffi> y h ich aufJieQc^s, 
t wre jare any r^flts.^ 
> on earliv^no rhajter how 




Wh%t about di* 



^hes^a^y reports^ f th 



sfibl|l<f d^cide^airly 
ana,preparecr at which 
r^are 



. $trait: ^pi 8 ^ J*ti right in princi 
«• the Tmdings^L they're aeeai 
•/ at ^^ it if it doesn't till tnem 

f Sizer: thatjs probabjy-tr 
& select. Those wh^S^ 




e^ but I stHi<| 
't Want to reai 



Rave*n:M tftink tbiy^iscuss^n is Btdittle sii 
the r^sults^ are going totb£ gFfjat! Di 
if ybu dbn^t mind* yd likMo tftrh to 





N pi^oJr%T« p wii've been 
I bur ibn^-raBge intdntjs iw m _ „ 

^sbitie of t]je kids in the ftrbgrafr^sgnie questiorts 



it so ftkm we Ji 

fju^nce dfuayse 



we*ve go^tien^me ppe^ire^I w|d say' from 




;1 xyjld take on the job of 



hy reports we 
l«g before 
udiences decided 



rts y^bn't* arid. ffaha^mN will t>e done 



aveS\to Fdc^esintlSt to the boa 
ell y«ii that? ^Th^y're ^refj 



i>een doir^ 
itude*s<ibbi 
ut sucb thii 



r 



bm— to. 
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eneral 
much for 

iy6;infor' 



now, in the 
assessment. But, since 
fe , _ca^ejq)g£t to* be asking 
ttmr totime in the prist, 
ttion available. to certain 
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peopie. We havenT\d&ne it, but since you'll be collefctuag^mwe thorough and more systematic dat$, you 
can expect to get such pressure even more strongly than we have. How would you handle that?



Sizer: Before we collect any data on any topic from anyone in this project (teachers, students, anyone), we will make it very clear that this information is confidential and will be seen only by project staff and no one else under any circumstances. This includes parents who ask to see data about their children, and teachers who ask to see data about their students, as well as anyone else. We can take on this project only if this is understood from the start.

Raven: That's good; we agree on that. Except the part about teachers seeing their students' data. If it's nonincriminating material, like self-esteem scores, feelings of personal efficacy, and the like, mightn't it help teachers to plan the best academic program for their students if they know about some of these characteristics? What would be the harm?



Sizer: All of this is personal information. It may not be incriminating, but it is private. We feel it's essential to assure confidentiality, both because it increases the possibility of truthful responses (since the children can assume that no one who knows them will see them), and because it's a way of showing respect for the integrity of the individual.

Strait: My, my!

Raven: I think I'm going to like working with you . . . except

Sizer: Except what?

Raven: Well, we've had a pretty informal and free-flowing program up to now. We've had some general guidelines, but people have done pretty much what they wanted, when they wanted. Now you're going to come in, make us define the program very specifically, determine what skills are needed, what all the components are, train a whole bunch of people . . .

Strait: I've been trying to get you to do that for quite a while, if you remember, Pam.



Raven: Yes, well, it just seems the whole character of the thing is going to change. We'll have to be rigid and precise, we'll have to decide on a set of procedures, and then not change for a whole year. I'm afraid all the fun is going to go out of it.

Sizer: Just think of it as reaching a new phase in the life of the program. You have completed the experimental phase, developed some procedures, and arrived at some intriguing hypotheses. Now you've reached a point where these procedures and hypotheses can be put to the test. To do that properly, you have to keep careful control over the definition of the elements of the program and over the specifics of their implementation. It may not be the same kind of fun you had when you were first developing the ideas and procedures, but ideas and hypotheses are worthless if they're never put to the test.

Raven: I understand that, in a way, but I can't help wondering whether by standardizing and routinizing the procedures, along with the data collection, we might be stamping out the very elements that may have been most important in making the project successful (and, as I told you, I know it was) when it was small and experimental: the enthusiasm, the excitement, the uncertainty about where it was leading.

Sizer: Well, in a sense, those are components of the program, along with specific program activities. It should be possible to do the program rigorously and completely without eliminating these "emotional" qualities. Remember, most of the teachers have been doing this for a year. If the thing is handled properly, there is no reason why they shouldn't be as enthusiastic and excited as last year. I think what you have expressed is an important concern that we should all be aware of and try to take steps to counteract.

■ \ . *jn , . : i> - & ^7. 



9r t 



^lem 



Strait: Well, I don't seem to hfc 
Rav^ni No.: 

Sizer: W^llj.well rSfirie the evaluation 
^. wKy^6fi f t w^^^tpgether afairvip 



r 
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in corpora te^fb me bf^ the things we discussed today, an^then 




eeks? 
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fe we go again. 




Third Discussion. More than a year later, late spring. Conrad and Allen have just entered Lacey's office, carrying several copies of the evaluation report. Our four characters sit around the conference table, ready to work.




Raven: From our talk on the phone, I know you have some good data for us; but frankly, I didn't understand what you were talking about. I got lost when you mentioned statistical interactions.



Sizer: OK, let's tackle that one by talking about the board's primary concern first: achievement results. To clarify it, I'll make notes on the blackboard as we go along. As you remember, we have a design including (writing on the board)

Factors

Schools
Grades
Tracks
Methods



So, within each of two schools, we have two grades (fifth and sixth); within each grade we have two tracks (high and low); within each track we have two methods (Jigsaw and Control). Well, our question is: does Jigsaw improve academic achievement? Our goal is to find out statistically. Let me lay out the possible effects on the board. My laziness compels me to use the abbreviations S, G, T, and M for the factors.




S
G
T
M
SxG
SxT
SxM
GxT
GxM
TxM
SxGxT
SxGxM
SxTxM
GxTxM
SxGxTxM
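The list on the board follows a fixed pattern: every subset of the four factors gives one term, so a design with four factors has 4 main effects and 11 interactions. The following is a minimal sketch of that enumeration, not part of the original evaluation; the function name `factorial_effects` is hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Factor abbreviations as used on the board in the dialogue:
# Schools, Grades, Tracks, Methods.
FACTORS = ["S", "G", "T", "M"]


def factorial_effects(factors):
    """Return every main effect and interaction term of a full factorial design.

    Terms are ordered by size: main effects first, then two-way,
    three-way, and finally the single highest-order interaction.
    """
    effects = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            effects.append("x".join(combo))
    return effects


effects = factorial_effects(FACTORS)
for term in effects:
    print(term)
```

The term of interest in the discussion that follows is TxM, the interaction between track and method.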

If you think about it, our findings show that when we compare all Jigsaw classes with all Control classes, controlling for the effects of school and of grade, there is a statistically significant difference in the achievement scores (using ANCOVA). But if you'll look at the TxM interaction, this is also significant. This says, in simple terms, that the effect of the method differs between the tracks, or that there is an interaction effect. Even though the Jigsaw classes as a whole differed significantly from the controls, most of the difference is due to the improvement of the low-track Jigsaw classes.

Now you can see the necessity for testing all combinations of factors. So you can go to the board and say, "Jigsaw classes in the low track had scores higher than their control classes. High-track Jigsaw classes had scores which didn't differ significantly from their controls."
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St " trie?. " %SaW imPr ° Ved academ,c achievement for "the low-track classes and dfdnt affect it to 



the high 



Sizer: Right. The results for self-esteem are more straightforward. The only significant effect was for method. That is, the Jigsaw classes had, overall, higher scores than the Controls.

Strait: Regardless of the other things, er, factors?

Sizer: Regardless of the other factors; all the interactions were not significant. So on this, you can simply tell the board, "Jigsaw improved self-esteem."





Compass: Well, we did find one difference in actual behaviors. When we went back to the school records and checked attendance, tardiness, and disciplinary actions for the last 4 years, we found that the Jigsaw classes had significantly fewer disciplinary actions this year than the Controls.

Strait: What about attendance and tardiness?

Sizer: There were no differences in either direction for either of those variables. So on this

Raven: We can tell the board, "Jigsaw reduced disciplinary actions."

Sizer: No! I said that they had fewer than the Controls; I didn't say that they decreased from previous years. In fact, they increased! But they didn't increase as much as the control classes.

Strait: That's understandable. As students get older, they tend to have more disciplinary actions. What you're saying is that Jigsaw reduced this expected increase.

Sizer: Right! And that's what you can say to the board.
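The distinction Sizer draws, between an absolute decrease and a smaller increase than the comparison group, can be sketched with a short calculation. The counts below are invented purely for illustration and do not come from the evaluation; the helper name `change` is likewise hypothetical.

```python
def change(before, after):
    """Return the change in a count from one year to the next."""
    return after - before


# Hypothetical disciplinary-action counts (NOT data from the report).
jigsaw_last_year, jigsaw_this_year = 10, 12
control_last_year, control_this_year = 10, 18

jigsaw_change = change(jigsaw_last_year, jigsaw_this_year)
control_change = change(control_last_year, control_this_year)

# Both groups went up, so "Jigsaw reduced disciplinary actions" would be wrong.
both_increased = jigsaw_change > 0 and control_change > 0

# The defensible claim is about the difference between the two increases.
relative_effect = jigsaw_change - control_change

print("Jigsaw change:", jigsaw_change)
print("Control change:", control_change)
print("Difference in increases:", relative_effect)
```

With these made-up numbers both groups rise, but the Jigsaw group rises by less, which is the pattern the dialogue describes as reducing the expected increase.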

Compass: There's another important element to this. Remember that we have to consider as many plausible alternative hypotheses as we can. Let's suppose that Jigsaw teachers didn't make referrals for the same disciplinary problems, but instead handled them in the classroom. To consider this possibility, we also looked at nonclassroom-related disciplinary actions. We got the same results. And our observations support this.

Raven: Tell us more about the observations. We certainly got a lot of help from the immediate feedback on the implementation of Jigsaw. Some of the teachers improved tremendously.

. " * ' s; . 

Compass: In terms of what the observers saw, observational data supported the significant quantitative findings. But beyond that, they've provided us with a wealth of information in three general areas, as they relate to the Jigsaw process. They are training, teacher, and student characteristics. The details are covered in our report to you, but I should comment on the highlights. The training would probably have been helped by focusing on teacher versus student control in the classroom. This issue affected implementation, and several of the teachers said exactly that to the observers.



Sizer: That ties in with teacher characteristics. It might be that better training and teacher selection could be achieved by taking something like authoritarianism into account. But that's a hypothesis for future testing.

Compass: And another one that really interests me is similar to the question of tracks. We know that low-track classes benefited more compared to high-track classes. But other student characteristics may cause effects. What about girls compared to boys? Or, what about differences in motivation?

Sizer: They would need a sizable grant to get to that level of detail, and right now they just want to survive. But it is important to note that the observers saw significant differences between groups, even within the same classroom, and that one of the differences was that groups that had more girls seemed to function better.
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"srsjfft: OK; we^iaW 
' . presentations to ffi 

Fourth Discussion




wondering what h " r ** * s»j _:\ „ 

Sizer: Hello.

Raven: Conrad?

Sizer: Yes? Oh, hi, Pam, we've been hoping you'd call. In fact, Allen happens to be here right now. He'll get on the other line.

Raven: OK. Lacey's on an extension here.

Sizer: Great, so how did it go?

Raven: Terrible, we didn't even get a chance to present it to the board.

Sizer: What! What happened?




Sizer: I don't believe it. But when did you hear?

Strait: The board president phoned day before yesterday, saying their budget committee had gone over the latest fiscal year figures, and there was no way they could continue funding Jigsaw or anything else.

Raven: I'm so depressed. I spent the whole day yesterday letting the Jigsaw teachers know about the board's decision.

Sizer: Hang on a minute, Pam. I want to hear about that also, but I'd like to know the whole story on the board first.

Raven: Right. I'm just still angry.

Strait: So the president said she was sorry but didn't think there was any point in making a board presentation if the decision was already made, and took us off the agenda.

Compass: And that was it?

Raven: Well, maybe one or two glimmers in the gloom.

Sizer: Like what?

Strait: The president said both she and another board member (what's his name, Pam?)

Raven: Lengenfeld.

Strait: Right. I can never remember him for some reason. Anyway, she and Lengenfeld had both read the full evaluation report and were quite interested in the results and might try to help us find some outside support, foundation or whatever.

Raven: But how real can that be?

Strait: Well, I'm not sure. It may just be a bone to soften the blow, but I had a feeling there may be some real interest there. She had read the thing and it seemed to have gotten her more interested. She was asking all kinds of questions.
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r: 'i . ■ ' ^ -. <" V W 

Compass: Hmmm, that's something to consider. I'm still reacting, myself . . . . When I think of the hours we put into it, to say nothing of your time, and the teachers', it's just disappeared down a tube . . . .

Sizer: How did the teachers react, Pam?

Raven: Now that I think about it, maybe that's the other glimmer. Everybody was disappointed, of course, but the thing I found interesting is that two of them (you remember Nancy and Doug from the sixth grade?)

Sizer: Right, the two who were always asking righteous questions about our evaluation design.

Raven: Yes. They came to me at the end of the day and said they had been talking about it and maybe there was a way the current Jigsaw teacher group could get together and do some in-house training next year.

Compass: That is interesting.

Strait: What's so disappointing to me is somehow just as the evaluation seemed to be actually helping increase interest, the plug gets pulled out.

Sizer: You know, we're thinking the same thing, but maybe it's not a complete loss. The two of you should be thinking about how to build on what the president and teachers said.

Raven: Believe me, I am. I'm getting all the Jigsaw teachers together next week to talk about it after I've had (and they've had) a little more time to think about it.

Sizer: We'll want to think about it, too. Look, I'd like to talk some more with you in a day or so, but Allen and I have a meeting this afternoon we have to prepare for. Could we get back to you?

Strait: Sure. Ah, there was one other thing. When the president called, she mentioned that she didn't quite understand one of the analyses in the report. At the time, I was too hot to even focus much on what she was saying, but suggested she could give one of you a call about it.

Compass: Oh, maybe she was really interested; maybe we could interest her in our coming in for another evaluation.

Sizer: Allen! One step at a time. We and they have both got a lot to consider. If she calls, she calls, but let's sort of let it sit for a few days.

Strait: Right at the moment, if you mention the word "evaluation" to me, I'm likely to see red.
Raven: Evaluation?! Nevermore— 

Strait: Pam, please, I thought we agreed you'd stay off that pun.

Raven: Oh, sorry. It seemed just right. Anyway, we'll talk to you again in a few days.

Sizer: Right, say Thursday.

Strait: OK, we'll call in the morning.

Compass: See you then, bye.
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ONE SUSPfeNSEFUL MELODRAMA 



Critical Moments in a Media Campaign Evaluation 



i 



This vignette illustrates a number of problems that program decisionmakers and evaluators encounter in the usual process of program development and evaluation. Every problem that arises in the unfolding of this drama is shared, although both primary actors see each problem as their own. Even more, many of the problems are seen by each as being caused by the other.

As the drama starts, the immediate problem is a time constraint, caused by a change in the theme of the prevention media campaign. But time is the fundamental resource, and its limits increase the awareness



of conflicting and un< 
Characters 








o Beverly LeBeau. The young founder of LeBeau Associates, a media production company specializing in public service mass media campaigns, and project director on the State-funded media campaign for Project Straightalk, a new, three-year alcohol abuse prevention demonstration project.
o Walter Stauback. A program evaluation specialist and project director on the separate State contract to conduct a "third-party" outcome and impact evaluation of Project Straightalk.
o Alice Stauback. Walter's wife.

Beverly LeBeau walked into the staff lounge and flopped into the armchair, saying to two of her key people, "He didn't look too happy, but I'm going to meet with him again tomorrow."

Beverly is director of Project Straightalk, the new, highly publicized mass media campaign to prevent alcohol abuse by teenagers. Beverly has already produced four public service media campaigns, of them on drug abuse prevention. She knows how to deal with the many people who can help or hurt a project like Straightalk. She knows how to manage tight production schedules and budgets. And she designs effective media products: creative, hard-hitting spots that grab the audience and deftly deliver the message. Beverly strives to meet the commercial advertisers on their own ground, with high-quality production values and messages that speak to people.

Beverly also prides herself on being a realist. She is resigned to the fact that public service money comes with many strings and that a big part of her job is keeping her projects from becoming entangled. Straightalk is State-funded through a contract between the State Alcohol and Drug Abuse Agency and her "media production shop." The contract requires that she deal every day with bureaucrats, advisory groups, evaluators, and other pains-in-the-neck. But knowing there are no "free lunches" in the public sector, Beverly is usually able to stay philosophical. Sometimes on a particularly frustrating day, she fantasizes about Michael Anthony appearing at her door with a seven-figure check and saying, "Beverly, go do it the way you know it should be done." However, Beverly knows that the work and the shackles are inseparable.

Today promised to be one of those bad days. Beverly was not looking forward to her first major meeting with the "outside" evaluator, Walter Stauback, since she and her staff had decided to change the campaign theme. Like Beverly, Stauback had written a proposal in response to a State request for proposal and had won the evaluation contract. That contract was huge, almost half the size of the 3-year, $950,000 media contract. Because of its size, Beverly knew that the State was serious about the evaluation.

Five months have passed since both contracts were awarded, and for different reasons both LeBeau and Stauback have been under stress during those months. Beverly has felt the pressure to firm up the campaign theme so that scriptwriting and production can proceed on schedule. This means constant coordination of the creative process with market research and the project's advisory board. The original theme, the one that had been presented in the proposal and had won Beverly the contract, bombed in the early research.

Small groups of carefully selected teenage volunteers had been brought together to discuss the theme "Alcohol is a drug!" and to see rough storyboards of television spots based on the theme. Beverly had developed the theme after reading surveys showing that many young people regard alcohol, and beer in particular, as a natural, innocuous, and harmless way to get high. "What's the problem? It's only a beer," was the attitude suggested by the survey data. In the proposal, Beverly had written, "Beer is regarded as the psychoactive equivalent of a soft drink by a sizable proportion of American youth."

The young volunteers in the discussion groups, called focus groups, yawned at both the theme and the
storyboards. Instead of responding, "I didn't realize that!", the teenagers reacted with "Of course," or "So
what?" The beer drinkers in the groups, even those least experienced, just didn't believe that they were
taking any serious risks. Their own experience told them that they could drink without
encountering trouble. And the nondrinkers, what few there were, already regarded alcohol as a drug; "I don't
need a crutch to have a good time" was their most typical response. None of the kids seemed to think that
Beverly's theme would change anything or anyone.

The State's reviewers and, later, the project's advisory panel had endorsed the campaign idea. But the
kids had not, and it was the kids who counted. Beverly had not been too upset because the theme had
subsequently proved barren for developing a good campaign. So Beverly and her staff had closeted
themselves for 2 days and emerged with a new idea. There was no time or money to test the new theme as
the old one had been tested, but Beverly had learned a lot about kids from the earlier focus groups and she
was absolutely convinced that the new theme would work. Besides, she and the staff had hit upon a
tremendously exciting format for the TV spots, one that would deliver this message with great visual power.

Beverly's project officer at the State Alcohol and Drug Abuse Agency, Molly Sorensen, hadn't been
enthusiastic about the revisions; she wasn't sure that all of the project's goals would be directly addressed by
the new theme. Beverly persuaded her to approve the changes by pointing out that the project's timetable
would have to be revised if any further delays were encountered. Since the beginning, Molly had emphasized
that the project must produce all the deliverables on schedule. Beverly was even able to persuade Molly
that another meeting of the advisory panel would be an unnecessary delay; the advisors' reactions to the
revisions could be more quickly and efficiently gathered via the mail.

Only when she had gotten Molly's approval in writing did Beverly call Walter Stauback to tell him that
the campaign's theme had been revised. Walter reacted with understandable anger; he had spent many hours
with Beverly clarifying project objectives, monitoring the development of scripts, and discussing the
evaluation plans to make sure they would be responsive. Walter was also under the gun. He wanted the
pretest questionnaire to focus upon the campaign strategy, so a good deal of his work thus far might need
redoing. But questionnaires had to be delivered to the survey firm within 10 days. The pretest survey was
scheduled to begin in 6 weeks in both the nearby experimental city and the highly similar comparison city on
the other side of the State.

As a gesture of good will, Beverly had offered to drive the £6 miles to Walter's office to explain the
changes and to help determine their implications for the evaluation.



A few minutes into their meeting, Beverly realized that Stauback was threatening her stereotype of
evaluators. Even under the strained circumstances, Stauback laughed occasionally. He spoke English and
not just "Research," [...] new ideas about the campaign. To her
surprise, Beverly found herself enjoying the conversation.

Stauback: Let me see if I've got this straight. You're saying that now you want to put across the message
that "Alcohol is for losers. The only way to be a winner is by working for it."

LeBeau: That's the basic idea. It's time to stop dancing around the critical point. In the long run, the only
way to really feel good about yourself and to succeed is to work hard at the things that are important to
you. Maybe some people will say it's puritanical, but it's true. One of the hidden dangers of regular
drinking is that it causes kids to waste time they could be spending stretching themselves in some way.
It also undermines their ability to push themselves. And too many kids rationalize that beer is OK,
thinking it only has a "little" alcohol.

Stauback: So you're primarily trying to change kids' attitudes toward beer, especially their perceptions of
the costs of using it: costs to their character and competence, not physiological or legal costs. At the
same time, you want them to get the idea that personal success and satisfaction come only through hard
work.

LeBeau: That's right. The message has two components. If possible, we won't just be telling them, we'll
also be showing them. There's not much variety in how kids get loaded, but hard work comes in
many forms. Athletics, arts, scholarships, business: there are plenty of paths for kids to take. Showing
how a kid can work hard in one of these directions will be the positive side of each script. Contrasting
the working kids with kids drinking beer, cutting back and forth between the two, pits the positive
against the negative. In each spot, hard-working kids grow, sweat, hit and miss, progress, achieve
something and feel good, while the drinkers continue to cruise, listening to music or playing the
arcades, complacent, stagnant, falling behind!

Stauback: You can show that in a 30-second spot?

LeBeau: I think so. It'll be tricky and tight, but I think so. We can do it with the TV spots, not the radio
spots. Radio requires a different approach: same message, but we will have to tell it rather than show
it.



Stauback: What about the other objectives? What about knowledge gains? And which behavior changes are
you looking for now: reductions in first-time use, in experimental use, or in regular use?

LeBeau: I guess we'll have the biggest effect on abstainers or kids who have just started drinking. We've
read the research articles you gave us showing that kids who already drink regularly aren't influenced by
mass media.

Actually, Walter's questions about other project objectives had surprised Beverly a bit, so she was
pleased that she remembered the research studies. The truth
was that for several days Beverly had not been thinking at all about "changing behavior," or increasing
"knowledge," or about anything except the new campaign idea and how to effectively translate it into TV
spots. Walter's questions about objectives had reminded Beverly of the terms of her contract with the State,
which specified that the media campaign was to "increase specific knowledge of the pernicious effects of
alcohol use, promote greater understanding of the risks and thereby reduce the abuse of alcohol by young
people ages 12 to 18."

Beverly wondered for a moment whether she could be criticized for ignoring the contractual objectives.
Legally she was covered (she had Molly Sorensen's formal signoff and had effectively neutralized the
advisory panel), yet she still felt a twinge of anxiety that perhaps she had neglected or overlooked something
truly important. But there simply wasn't time for indecisiveness or backtracking, and the new spots were
going to be the best she had ever done.

A half-hour later, Walter Stauback decided to cut the meeting short and schedule another one with
LeBeau for the next day. Walter was upset and he needed time to think. With great enthusiasm, LeBeau had
described in detail the scripts for four different spots. LeBeau was a gifted storyteller and Walter had
appreciated the visual and dramatic impact of each script. However, LeBeau's impressive presentation did
not alleviate Walter's increasing concern; rather, it added to his worries. Walter could see that Beverly had
invested much time and energy in the scripts and was firmly committed to the new concept. He could
understand how the new theme might be a major improvement on the old, but from his own perspective the
new theme did nothing to solve the complex, intertwined problems that plagued not only the evaluation but
the entire project.

That night Walter asked his wife's advice, as he usually did when he was considering major decisions.
Alice was a wonderful listener. Often she simply asked a question here or there and let Walter find his own
solution.

Walter: The biggest parts of the problem are the unrealistic expectations and the timetable. The
State's goals for the campaign are pie-in-the-sky. Mass media campaigns do not produce major
attitudinal changes, let alone behavioral changes. The State people think that changing kids' decisions
about alcohol use is like changing decisions about which soap or toothpaste to buy. The media people
do, too. Show the kids the spots a few times and they'll straighten right up. But decisions about
whether to use alcohol are a lot more complex and hard to influence than choosing a brand of tissue.
These are not superficial choices like Kleenex or Scotties; these are behaviors that depend on dozens of
considerations. In the last few weeks I've reviewed nearly a dozen evaluations of public service mass
media campaigns and not one found a major shift in behavior.

Alice: Have you explained this to them?

Walter: Not really. I didn't realize it until I'd read the evaluations, and I'm positive they don't want to hear
the bad news. And who am I to tell them about the media or alcohol use? The media people half
believe in the theory that information changes attitudes and attitudes change behavior. They also
believe, "Link it to sex or success and it will sell." I'm not sure what results they expect, but they
certainly aren't worried whether the campaign will be successful.

As for the State people, they want to show the legislature and everyone else that they're doing their
job, which means changing behavior, I guess. They seem most concerned that all the "deliverables" (the
products) get produced and get produced on time.

Alice: You don't think they have any chance of succeeding?

Walter: It all depends on how you define success. The media people can get their message across. They can
get kids to remember and understand their campaign idea if they do a good job. Maybe, maybe, they
can get some attitude shifts, especially if they can keep the message in front of the kids for a long time
and if they can focus it on a specific attitude. And they may be doing that. I
wouldn't bet on any behavioral change, even if they have a huge budget for buying air time, which they
don't have a hundredth of. They're spending most of their money on TV, producing 30-second spots and
buying air time, but TV is a very inefficient and expensive way to reach teenagers. Teenagers watch
less TV than anybody else. I think they would get a lot more for their money if they concentrated on
radio, billboards, and buscards. Even school newspapers.



Alice: TV's a lot more exciting to them, I'll bet. There's one thing I don't understand. You think they are
making big mistakes, but really, none of this makes your job harder, does it?



Walter: It makes my job easier. If my primary responsibility is measuring changes in general behavior and
attitudes regarding alcohol, I can just go ahead and finish the pretest questionnaire and run the pretest
survey without worrying too much about what their theme is or what the particular spots will be like.
Measuring the general effects, if there are going to be any, is easy. I've already gotten
most of the general questionnaire items I need by pulling them from previous surveys and evaluation
studies. Measuring the specific or immediate effects of the particular theme requires that I know
exactly what they're going to be saying or doing, so that I can write questionnaire items that show changes
from pretest to posttest in kids' recall or recognition of the theme, in specific kinds of knowledge or
concerns, and so on. Those items I have to write myself and try out to make sure they work.



Alice: How can you do that? You're out of time! I thought you had already finished the questionnaire.

Walter: I thought it was finished, until they changed the theme. Time is the real killer here. The media
people are being forced to rush into production before they should, and I'm being forced to run the
pretest prematurely. The State thinks it's protecting its investment by holding us to the timelines, but
it's ensuring that the money will be squandered.

Alice: Didn't you know that the time frame would be tight before you bid on the project?

Walter: I knew it and I didn't know it. When you're writing a proposal, you tend to go along with what's
demanded and to adopt the requester's perspective. You're hungry and you want to please. It's
different afterwards when you have to live with the day-to-day pressure. In actuality, it's never as
simple or smooth as you hope it will be beforehand.

Alice: So what are your options?

Walter: Obviously the smart choice is to stay on the sidelines and do the general outcome evaluation.
That's certainly the easy thing to do. The alternative is to make trouble for everyone including myself,
to tell the media people I think we're all making mistakes. If they
react positively, I'll do my best to focus the evaluation on their final product. But they can't afford to
listen to me (and I can't afford to do anything either) unless the State backs off on the time schedule.

Alice: I have a hunch you've made up your mind already.

Walter: Yes. Maybe I'll open all of this up with LeBeau tomorrow.

The next afternoon, Beverly had two reactions to Walter's concerns. One was irritation. She just didn't
have time to deal with this, even if it made some sense. But she was also surprised and
impressed that Walter cared enough about the project to have wrestled with these issues so seriously.

LeBeau: Look, I'll be straight. I think you've got some good points, but that you're way offbase on some
others. But really that's irrelevant. We just don't have time to redesign anything. And you don't,
either.

Stauback: You're right, unless we can renegotiate the schedule and the deliverables. We can go to Molly
together if we want to. What have we got to lose?

LeBeau: A lot. For one thing, the time you and I take discussing all of this, and for another, the time we
spend talking to Molly. Not to mention the dues we'll pay one way or the other over the next two and a
half years for scaring her and helping her to see that things are more screwed up than she realized.

Stauback: Maybe so. Maybe so. At least tell me your reactions to what I've said.

LeBeau: OK. I'll make it short and sweet, and we can go from there.

One. You're obviously right about the time crunch. I need the extra time just as badly as you do.
You're absolutely right.

Two. I guess I don't really believe that this project will produce behavioral change by itself, but I
do think it will change attitudes and awareness. And that's a significant result in my book. Even if the
campaign affects only one or two of dozens of factors, maybe that's worthwhile. If kids clearly or more
deeply understand the risks of drinking, that's important. It may not pay off behaviorally in the short
run, but maybe in the long run it will. Kids don't really understand the type of risk we're focusing on.



Three. I don't think we are relying too heavily on TV, although I'll admit that TV's where the
professional payoff is greatest for us media types. Remember, we can aim the spots at who we want by
putting them into the right shows and time slots. We're buying air time, not asking the stations to give
us public service time. (You know, the 6:00 a.m. and 2:00 a.m. time slots.) We'll buy time on the
programs that give us greatest "reach and frequency," which means the greatest number of exposures to
the spots by the greatest number of kids for each dollar we spend.

There's another point that you've got to understand about TV. We want people other than the kids
themselves to see the spots. We want parents, older brothers and sisters, teachers, you name it, to see
the spots. We want the message talked about in the home, in school, wherever, and we want it
understood by everyone, so that it will be supported from all sides. TV is the way to get people talking
about something like this, because it is the mass medium. If we're lucky, and if we handle this right,
the TV exposure will stimulate some newspaper and magazine coverage, maybe even some TV news
coverage: publicity that will be priceless for spreading and supporting the message. So don't sell us
short. TV is the way to make a lot of things happen.



Four. I do want you to do the specific evaluation. We need that level of precision to know what
really happens. It makes me a little nervous, but I'm deeply curious to know how much we really get
across to kids. I sure don't want to put all our eggs in the behavioral-change basket.

Stauback: I need more time if I'm going to do a specific evaluation. I'll need to know precisely what you're
going to be doing all along. You'll have me looking over your shoulder for 2 more years.

LeBeau: I understand. That's OK with me. And I know that I'll have to delay the start of the campaign so
that you can finish the pretesting first.



Stauback: Let's go see Molly.

LeBeau: Let's go see Molly.

Working as a team, Beverly and Walter were able to renegotiate the time frame for the project. Their
success came not so much from the astuteness of their reasoning as from convincing the State staff that a
specific evaluation would be in their interest as well. After all, their agency's reputation would not be
enhanced by a general evaluation that showed no effects. With measures of specific outcomes added, the
evaluation was much more likely to supply some sort of evidence that could be used to justify the State's
investment. Of course, Beverly and Walter's cause was also aided by the fact that they were not asking for
more money or a reduced workload, just for a revised schedule.

Beverly and Walter came to understand and trust each other more as they continued to work closely and
talk honestly about their ideas and concerns. Problems arose often, but most could be handled to their
satisfaction. And their growing respect for each other helped them accept the occasional sacrifices each
had to make for the other.




CHAPTER 7: POLITICS AND SCIENCE IN PREVENTION PROGRAMING

(What Really Goes On Outside)



Evaluation of social programing, like the programing itself, does not exist in a political vacuum. To the
other elements defining the context of social programing (the source of funds, the organizational
foundations of the program, the constituency created by the program, and its social setting) evaluations
introduce their own political necessities.

Evaluation has always been part of the learning process by which social organizations profit from
lessons of the past and evolve into stronger, more effective institutions. The strongest
motivation of all social organizations has been self-preservation, and those that have survived over long
periods of time have learned their lessons well.

Today, it is difficult to think of evaluation simply as a natural learning process. Beginning with
Suchman's classic text (1962) and building on a historical foundation of educational evaluation, evaluation
research as we know it today has emerged as a new discipline, blending knowledge of economics, operations
research, and almost every aspect of the social and psychological sciences. Concomitant with this evolution
have been the wide-ranging social programs launched by the Great Society legislation of the middle 1960's,
which called for evaluation at every level of planning and programing. This recent history has cast
evaluation into a special light, sensitizing [...] and program personnel to the political implications of
evaluation. It has become such a specialized dimension of social programing that one can lose sight of its
role as the basic learning which accompanies all healthy programing, whether special evaluation research
studies are funded or not (see Bittner 1972, for further discussion of this point).



This volume, as well as this chapter, focuses on the interpretation of evaluation as a formal study,
rather than as a naturally occurring tool for learning. Of course, the formal evaluation study should also
help those managing a prevention program to learn and to make that program more effective.

The strongest political aspect of an evaluation study is its potential threat to the survival of the
evaluated program. In times when funds for even basic social services (education, health care, and public
safety) are in short supply, the threat to funds for recently conceived social services such as drug abuse
prevention is even greater. In a political climate when every competing program is being carefully
scrutinized, negative findings in an evaluation report can endanger a program's very survival.

But even though programs and the funding agencies must continue to rely on evaluations to learn how
well the prevention programs are performing, neither the evaluators nor the programs themselves need be
helpless victims of circumstances. The central question addressed by this chapter, therefore, is how, in an
increasingly changing political and economic context, one can have sound evaluation that supports the
growth of alcohol and drug abuse prevention programs and that helps them survive and improve rather than
provide ammunition for their opponents.

One approach to this issue, in harmony with the messages of preceding chapters, is presented below.

o First, it is important to understand in advance the political problems associated with the
evaluation of alcohol and drug abuse prevention programs.

o Second, it is important for the program manager and the evaluator to arrive at an open, shared
understanding of their personal and professional goals for the evaluation so that it can be
accomplished in an atmosphere of mutual trust.

o Third, it is important to develop a comprehensive plan before the evaluation starts. A critical
element of that plan concerns how the political implications of the evaluation research are to be
addressed, spelling out the complementary roles, in this regard, of the evaluator and the program
manager.

o Fourth, throughout the evaluation the evaluator and the manager should maintain a close working
relationship, so that they can solve, to their mutual satisfaction, the political issues which are
likely to arise at each stage of the evaluation.

o Finally, to the extent possible, the political forces outside the program should
also be included in this process. Advance planning is essential, but it can only go so far in
anticipating the manner in which these political forces actually develop around an evaluation.
Real effectiveness in dealing with these issues must arise from continual interaction with external
powers, which initial understanding and planning can do much to assure.

Another purpose of this chapter is to show how to present evaluation data, results of which are almost
never a total success or an abject failure. Usually, they point up strengths and weaknesses in a complex
fabric of findings and interpretations. The limitations, seen in proper light, provide opportunities for
improvement; and the strengths highlight the achievements that the program has already accomplished.

The manager and staff of a program can be expected to examine findings which point in a variety of
directions and to discover the lessons that can be learned. But persons outside of a program are less likely to
ponder a complex pattern. The news media especially like to have their stories etched in black and white.
Therefore, this chapter suggests ways in which managers and evaluators can present complex, ambiguous
evaluation results simply, in a manner that benefits the program and satisfies the need of more remote
audiences.

It is assumed that the evaluator has undertaken to assess program effectiveness within a framework
that the program itself defines, that is, in terms of the program's goals. Ideally, the evaluator is detached
and willing to give the program a fair test of its effectiveness. But the tacit (sometimes explicit)
understanding is that the evaluator will accept the goals as the program defines them and, in terms of the
underlying theory of alcohol and drug abuse prevention, will relate those goals to the problems of the
participants. As Carol Weiss has stated in generic terms (1975, p. 19):

First, evaluation research asks the question: How effective is the program in meeting its
goals? Thus, it accepts the desirability of achieving those goals. By testing the
effectiveness of the program against the goal criteria, it not only accepts the rightness of
the goals, it also tends to accept the premises underlying the program. There is an implicit
assumption that this type of program strategy is a reasonable way to deal with the problem,
that there is justification for the social diagnosis and prescription that the program
represents. Further, evaluation research assumes that the program has a realistic chance of reaching the
goals, or else the study would be a frittering away of time, energy, and talent. These are
political statements with a status quo cast.

This initial willingness to see the world as the program sees it, at least provisionally, is a major political
stance that most evaluators take when they do an evaluation. This stance must go even a step further;
namely, evaluators should be committed to their work being used to strengthen the program
whenever possible. This commitment is the foundation of the mutual trust and understanding that are
essential if evaluator and manager are to work together with external forces to deal with the many issues
surrounding an evaluation.

The remainder of this chapter is organized into five sections:

o Four Case Studies

o Issues Relating to Values

o Issues Relating to Evaluation Design

o Issues Relating to the Presentation of Findings

o Concluding Guidelines.



For several reasons, the chapter focuses on outcome evaluation, with only occasional references to
process and impact evaluation. Most external political issues arise from outcome evaluation, primarily
because it is the type with which non-evaluators are most familiar and for which they have the clearest
expectations. Process evaluation results are typically used within the program context, and impact
evaluation results have the same external political ramifications as outcome results.

FOUR CASE STUDIES

The issues raised later are illustrated here with examples drawn from the evaluations of four prevention
programs conducted by the author or his associates. Obviously, these case studies do not reflect the full
scope of prevention programing. All involved programs designed to serve
adolescents and young adults. A great deal of contemporary drug and alcohol abuse prevention programing
focuses on other special populations.

Because of the sensitive nature of the issues being discussed, the four case studies are anonymous. All
identifiers have been changed, and some fictional illustrations have been added.

Project Commune

Project Commune was an early intervention project, providing individual counseling,
group counseling, and referrals to other programs for specialized help. It served high school students and
young adults who were experimenting with drugs and were self-motivated or were encouraged by their
families, teachers, or friends to seek help before more serious drug use caused real harm. The setting was a
suburban university town, Los Verdes, Arizona, providing the program with a white, middle-class clientele.
The most interesting feature of the program was that it was based on Maoist philosophy and was run by a
collective of seven female managers, the "Committee," no one of whom was officially more in charge than
any of the others. The principal evaluator was a male, and both outcome and process were evaluated.

The Chinese Youth Club (CYC)

The Chinese Youth Club was a storefront program located in the Chinatown area of Big City,
California. It served a population of secondary school students who were recent immigrants
from Hong Kong, Southeast Asia, and mainland China. The program used the facilities of neighboring
schools and provided tutoring, Chinese arts, sports programs, and individual and group counseling to the
students and their families. The students lived in an inner-city community characterized by a considerable
amount of drug use, drug dealing, and gang membership on the part of Chinese youth and others. The
program's clientele did not have a history of any drug use on entering the program. The program was
evaluated from both process and outcome perspectives. The program manager was Sue and the evaluator
was Elliott.

The Mexican-American Youth Alliance (MAYA)

MAYA was a prevention outgrowth of a community-based heroin treatment program. After a number
of years of providing effective treatment of addicts, this Mexican-American
community sought to prevent the development of heroin addiction by working with secondary school
youth. They provided a Chicano prevention counselor in the three junior high schools and the one senior high
school that served this inner-city Chicano community. The counselors
conducted values clarification sessions in social studies classes, provided individual counseling during the
day, and conducted a cultural club for Chicano youth after school which included sports, arts and crafts,
outings, and group counseling. Maria was the program manager and Thomas was the evaluator. Process and
outcome evaluations were undertaken.

The New Life School

The New Life School served Saddle Creek, New Jersey, a large bedroom community of a major eastern
city. Like Project Commune, it was an early intervention program, helping secondary school youth who had
begun to experiment with drug use. It provided an alternative school setting, which was strictly enforced as
drug free, and in which students could reestablish their commitment to doing well. The program also included
counseling groups for parents. The clientele were black and white middle-class students. They spent a year
away from home in this specialized school to prevent limited experimentation with drugs and alcohol from
blossoming into a full-blown drug-oriented lifestyle. The school was evaluated with both process and
outcome evaluations. The program manager was Sharon, and the evaluator was Michael.



ISSUES RELATING TO VALUES 



The Evaluator Has Values 



Although most evaluators strive to be objective, they inevitably bring their own values into the
evaluation. [...] to know their own values and,
therefore, cannot take them into account in efforts to be objective.

Managers must know the evaluators' values and be able to discuss them openly and frankly. Often
evaluators feel some cultural distance between themselves and the program and its setting, even if they are
from the same culture. For example, the New Life School serves a middle class suburban community on the
east coast, and Michael, the evaluator, grew up in a suburban middle class community in the midwest. Not
only are the two communities geographically distant, but also youth culture has undergone a dramatic
transformation in 20 years. In addition, because the program manager and staff averaged about 10 years
younger than Michael, he felt out-of-tune to some degree with the staff, and even more so with the
students.

The cultural distance becomes much greater when the manager, the staff, and the clients come from a
cultural background distinctly different from that of the evaluation team. Consider the Chinese Youth Club,
in which all staff and clients were recent immigrants to the United States, all within the past 12 years,
many having been in the United States less than a year. The evaluator, Elliot, on the other hand, grew up in
a small, rural university town in Northern California. His family background was white and middle class, as
was most of his hometown.



Most of the CYC staff and about one-third of the students came from Hong Kong. Until the
normalization of relations with China and the lifting of immigration restrictions, the majority of the
Chinese immigrants to Big City came from Hong Kong. But since the political shift, nearly three-quarters
of the immigration to Big City is from the mainland. The Hong Kong Chinese speak English well and are
comfortable dealing with occidentals. In contrast, mainland immigrants usually have no knowledge of
English and are more timid with occidentals, at least until they become familiar with the language and the
culture.

Through his upbringing and his own tastes, Elliot had developed an affinity for Chinese culture and,
therefore, felt comfortable working with Sue and her staff. He probably would not have felt as comfortable
had the program [...] Chinese from the mainland. As a result, he was inclined to be favorable
towards the program, a bias that was nonthreatening to Sue and the CYC.

On the other hand, Elliot's research assistant [...] Chinese working on his
doctorate at Big City University. He was inclined to be critical of the way the CYC operated, and would
have liked a more professional staff, with advanced degrees in counseling or education. Although Elliot
recognized these feelings in Robert, [...] did not feel that he knew him well enough [...]
the CYC staff seemed confident that the tone of the final report would be in Elliot's hands, and that he
would filter out excessive negativism on Robert's part.

A program with a strong political orientation cannot ordinarily find an evaluator with a shared outlook;
it can, therefore, expect to feel some discomfort with almost any evaluator.

Mutual openness is important with respect to this first issue. In instances where the manager selects or
participates in the selection of the evaluator, the manager should request that the evaluator identify those
values relevant to the evaluation, especially any that relate to the program's goals, methods, and cultural
background. If a candidate evaluator seems unwilling to be frank, seems uncommunicative, or expresses
values that make the manager uncomfortable, rejecting the candidate might be wise.

Time and resources probably do not permit an exploration of the values of all members of an evaluation
team. However, because the principal evaluator will have the greatest impact on the evaluation
and on the manner in which results are presented, understanding that person's values is normally sufficient.

One actual instance illustrates how disastrous the consequences can be of failing to recognize a bias.
Two principal investigators were awarded a grant to evaluate a national, multi-site program for juvenile
delinquents. These investigators held strong personal theories of delinquency and privately expressed their
hope that these programs would turn out to be failures. They therefore undertook this evaluation to "prove"
the programs ineffective. The results confirmed their expectations; the published outcome was exactly
what they had wanted.

The phenomenon of researchers' finding what they are looking for is not always so blatant. Even when
evaluators have only a latent belief about how things should turn out, the results will quite likely support the
validity of that belief. Citing excellent psychological research demonstrating the frequency of this
phenomenon, Martin Orne has labeled it the "demand variable" (Orne 1962; Orne and Evans 1965). To the
extent that managers can control the situation, they must ensure that no "demand variable" exists to cloud
the evaluation results.

And the Program Has Values Too

Of course, an effective collaborative relationship requires openness on the part of the manager as well
as the evaluator, although the two parties need not share the same or similar values. What is necessary is
that they understand each other's values and that the values of neither party work against a reasonable
evaluation. Often the evaluator and the manager have strikingly different values, but both parties have
agreed to respect their differences as best they can.

Project Commune provides a striking illustration. In this rare instance a drug [...]
founded on a Maoist feminist philosophy was funded by a State criminal justice planning agency. The grant
required that the program secure an objective outside evaluation. The seven managers approached a friend
at a local university, who helped them find an evaluator, George, who then hired a small staff and designed a
process and outcome evaluation study for Project Commune.

It is inherently problematic to deal with more than one manager. In this case there were seven, all
nominally equal to each other, a structure which George had to respect. However, the situation was made
somewhat easier because the managers' deeply held extreme political views were remarkably similar,
obviating much of the internal value conflict which might ordinarily have been expected.

George was at the time a rather liberal Democrat, but from the perspective of a Maoist, his position
was not much different from an extreme right-wing Republican. So from the start, all accepted the gulf
separating their outlooks and values. To work together, they negotiated a compromise around the
distinction between process and outcome evaluation. The process evaluator would, of necessity, have to get
close to the program, whereas the persons collecting the outcome data needed to maintain their [...]
and did not need to "infiltrate" the program. George, in conjunction with the Committee, selected a woman
graduate student in sociology at the local university to work half-time as the process evaluator since only
another woman could probably have secured the trust of the Committee and the staff. Although not a
radical, the woman had strong liberal views and was regarded by the Committee as co-optable. In fact, to
some extent, she was co-opted as the study progressed, casting some doubt on her objectivity. However,
given the political nature of this program, the selection of a woman may have been a necessary condition for
process evaluation data to have been collected at all.

This illustration provides a clear example of how an evaluator and a group of managers solved a
difficult situation of dissimilar value orientations and [...] evaluation.
Mutual respect for each other's values, formed during an initial collaboration, made it possible for the two
parties to work together throughout the evaluation. In general, the greater the degree to which the evaluator
and manager can understand and respect each other's values, the more likely they are [...]
throughout the evaluation. Mutual trust is essential for working through thorny political problems that
typically beset the presentation of evaluation findings for a program in the public eye. Thus, establishing
reciprocal understanding and trust is a critical first step in dealing with the politics of evaluation.

The Community and the Political Leadership May Be Watching

Prevention programs operate in a context of community values, of significant bureaucrats, and of
political leaders. This larger, external context is usually foremost in people's minds when they think about
the politics of evaluation. The values internal to the program and to the evaluation interact with these
external values in the resolution of the evaluation's political issues.

The evaluation of the MAYA prevention program illustrates issues associated with a concerned
community. In this instance, the Chicano community, with serious heroin addiction problems, had been
neglected by city agencies. A politically aware and creative group of young men and women conceived the
idea of getting a grant to set up a heroin treatment program. They were successful, and the MAYA program
came into being. The founders, however, were not good administrators, and the requirements of the State
funding agency forced them to hire a professional administrator, Maria, who came from a Chicano drug
abuse treatment program in Big City, California. Soon after her arrival at MAYA, Maria applied for a
prevention grant.

The community was uneasy. It did not want to relinquish control of program administration to a
professional and an outsider. The second grant, the prevention grant, also affected the operation of the
agency, including the requirement to let a substantial contract for an evaluation. In time, community
members on the board of directors were replaced by members from some of the agencies that MAYA dealt
with, including a deputy superintendent of schools, a probation officer, and a member of the sheriff's
department, all of them Anglos. Gradually, Maria felt constrained to act as a bridge between two cultures
with little mutual understanding: the local Chicano community and the Anglo, middle-class bureaucracy
that provided the funding. In many instances, it seemed as though actions that pleased one constituency
only upset and confused the other.

Thomas, the evaluator, felt at once beset by this strain and mistrust when he arrived to evaluate the
MAYA prevention project. To make matters worse, because of distrust of Maria's commitment to
evaluation, the State funding agency had specifically selected Thomas as an evaluator. But Thomas and his
staff were Anglos, only one of whom had experience dealing with Chicanos and could speak a little of the
local Spanish dialect.

On the positive side, Thomas and Maria soon realized that his presence and Anglo background could help
give the prevention component of MAYA credibility with the Anglo funding source. The community,
however, was anxious that the Anglo influence and the professional character of Maria, her staff, and half of
the board of directors not undermine MAYA's focus on Chicano concerns, values, and culture. These were
the shared concerns of Maria and Thomas as they mapped out the evaluation.

Whereas the MAYA program needed to work within the concerns of the local community, the New Life
School focused on the politics of the school system and the board of education. The New Life School had
been founded, over the superintendent's objection that the school system was doing all that was required,
because of the personal commitment of two board members. Once established, it also had strong support
from the Assistant Superintendent for Alternative Schools, under whose authority the program fell.



The evaluation was planned and undertaken by the Division for Program Assessment, which hired Michael,
an outside evaluator, to evaluate the prevention school. Michael and his staff were hired by a competitive
procurement conducted by the division. The New Life School had been underway for a year when the
superintendent's office decided to have it evaluated, with the expectation that the findings would be
available to the board of education in time to consider the school's refunding.

Michael first encountered Sharon, the manager and the principal of the New Life School, at a meeting in
the office of the Assistant Superintendent for Alternative Schools (Sharon's boss and mentor). The meeting
also included the director of the Division [...] conflict
between program administration and evaluation. At the time of this first meeting, Michael was fairly new
to the scene and only slightly aware of the political history of the school. He did feel that the meeting was
strained, but could not immediately understand the source of the conflicts.



After a little investigation and development of a closer [...], Michael began to
sort out the nature of the political pressures. It seemed clear that the "pro-school party," consisting of
several board members and the assistant superintendent, were looking for a favorable evaluation. The staff
of the Division of Program Assessment were neutral and wanted only to see the evaluation carried out
professionally. [...] associates were probably slightly hostile to the
program because of the manner in which the board had pushed it on them, their negative feelings did not
seem very strong, and they were willing to support the program if the board continued to want it.

The case of the New Life School is typical of many instances in which a prevention program has drawn
considerable attention to itself at the time of its founding, resulting in some polarization of key political
forces. At the same time, most political situations are complex. It is often most clear who the committed
supporters are. Other key actors, often neither for nor against the program, may be somewhat threatening
because they cannot be relied on to support the program if findings are not favorable. Usually there is also
a third camp, which continues to bear a grudge against the program. These individuals do not necessarily
lean on the evaluation for negative conclusions, but they would probably be pleased at such an outcome.
Such forces need to be understood and sorted out before an evaluation can be undertaken since they will
come into play when a report is released.

The World of Macro-Politics

Macro-politics may affect any social field, but at times changes at this level are [...] radical.
The budget cuts for social programing now in effect could alter the very structure of prevention programing.
Major support responsibility has now devolved upon the States, a few of which are enjoying exceptional
wealth because of fuel severance taxes while most are facing serious fiscal problems. The resulting picture,
especially in the poorer States, is one in which drug abuse prevention programs compete for
Federal, State, and local tax dollars with a wide range of health programs, most of which have strong
medical and consumer constituencies. In such a climate, prevention programs need extraordinary support to
maintain and expand their funding base. History has shown over the past two decades that favorable
evaluation results are seldom, if ever, a deciding factor in such debates. But favorable evaluation results
can be added to other kinds of supporting information to build a more compelling case for the continued
support of prevention programing. In this context, sensitivity to the larger political picture takes on an
unusual degree of importance for evaluations.



ISSUES RELATING TO EVALUATION DESIGN



Specific versus Generic Prevention 

Anyone in the prevention field comes to realize that the categorical boundaries by which Government
agencies address the world of education, health, and human services often make it difficult to encompass
real world problems. Prevention of alcohol and drug abuse provides an especially poignant example of how
the "official" versions of the world differ dramatically from the experience of programs dealing with
prevention "on the street."



Preventing behaviors destructive to the individual's health and well-being, and potentially destructive to
others, of which drug abuse prevention is just one aspect, is by its nature a unified generic problem.
Evidence from a number of research studies suggests that among adolescents, alcohol and other drug abuse
are associated with each other and with delinquency, teenage pregnancy, problems of family life, and poor
school performance (Jessor 1979). Problems demanding prevention initiatives are found among young adults,
the middle-aged, and senior citizens, each with their own peculiar generic mix. A look at the Federal
bureaucracy reveals that intrinsically related prevention activities have been funded by the Alcohol, Drug
Abuse, and Mental Health Administration (ADAMHA); by other agencies of the Department of Health and
Human Services (DHHS) concerned with aging; by the Department of Justice; and by the Department of
Education. Several other Federal and State offices, agencies, and institutions have funded research and
demonstration projects relating to one aspect of prevention or another.

In this context, local programs have at times shifted their emphasis from one [...]
to another, shifting, for example, from drug abuse to delinquency prevention and doing a credible job of
both. Some progress has been made linking prevention efforts involving drug and alcohol abuse at the
Federal, State, and local levels.

Program managers generally recognize that their prevention efforts, in most instances, have generic
impacts broader than alcohol and drug abuse prevention alone. Program effects across the range of
destructive behavior depend on the nature of the prevention modality and the risks associated with a
particular population being served. In addition to drug abuse in our four case studies, the risks of
destructive behavior include alcohol abuse, delinquency, and failure in school.

The model of drug abuse onset and other destructive behaviors proposed by the Jessors (see, for
example, Jessor 1975; Jessor and Jessor 1975a; Jessor and Jessor 1975b) suggests that changes in destructive
behavior form a predictable pattern. Thus, a genuine change in an adolescent's lifestyle away from drug
abuse would probably be accompanied by changes in other aspects of life such as school attendance,
academic performance, and the tendency to commit delinquent acts and status offenses and other disruptive
behavior. This model, therefore, justifies a program's efforts to correct behavior more generically, rather
than to focus simply on drug and alcohol abuse. For certain preventive strategies, therefore, it may be
important to collect clusters of appropriate prevention outcome data to understand the degree that
prevention efforts result in broadly based life changes.

In three of our four cases, additional data were collected on delinquent and acting out behavior (CYC,
MAYA, and the New Life School). In two of these cases (CYC and New Life), information was collected on
aspirations ([...] dimension of the Jessor model); and for the New Life School, detailed
information was also collected on school attendance and academic performance. In all instances the kinds
of clustering of outcomes that one would expect from the Jessors' model were noted.



For a program with high public visibility, the collection of a wide range of outcomes may be advisable.
The ability to demonstrate outcomes in a number of areas of public concern may be helpful in developing a
broad-based constituency for future funding. The selection of outcome measures
may have significant political overtones and should be a collaborative effort of the evaluator, the program
manager, and other decisionmakers.



Control Over the Evaluation Report

Evaluators, in general, are rewarded by having their work read, used, and appreciated. A
spectre that hangs over the evaluation field is that the commissioning agency might suppress the report or
prevent the evaluator from making the findings public. Such suppression may be reinforced by highly
restrictive language in the evaluation study contract which gives the contracting agency complete control
over the findings and any reports produced. However, once word gets out that an agency has exercised such
authoritarian control over a report, it may be difficult for them to contract with reputable evaluators in the
future.

Understandably, of course, managers are concerned that an evaluation report will contain material that
in their view is totally misleading or erroneous, and that they will not have an opportunity to detect such
problems before the final version of the report is published. Or, even if managers do see a draft, they worry
that evaluators will cling stubbornly to erroneous views, and that needlessly damaging or misleading reports
will see the light of day, without any opportunity for the manager to express a dissenting opinion.



This problem can be avoided if, at the design stage, the evaluator and manager work out a mutually
acceptable set of guidelines to govern the preparation and issuing of publications. Following is an example
of the way such guidelines might be drawn up.

o The evaluator agrees to show the manager a final draft of any reports or articles which are to be
published concerning the study to allow the manager to review and comment.

o The program manager agrees to review and comment on any draft materials in a timely manner
and to comment frankly on the draft.

o The evaluator agrees to consider carefully the manager's comments and criticisms, to make
appropriate changes in the text of the draft, and to show these changes to the manager.

o If the manager continues to have serious reservations about the contents of the draft, even after
all the changes which the evaluator is willing to make have been made, these dissenting opinions
may appear as an addendum to the report. If the material is to be published in a journal or book
form, where there is a serious concern that misrepresentations may damage the program, the
manager should have the right to [...]



Guidelines like these assure the evaluator of a right to present findings in all appropriate channels and
assure the manager of means to protect the program's interests. Even when the program and the evaluator
are on harmonious terms, as was the case with the CYC evaluation, such guidelines are best expressed
formally.

The Selection of Goals to be Measured 



Another major concern is whether the stated goals of the program are the goals actually pursued. The
author once participated in an evaluation of a drug abuse treatment program in which the published goal was
to help adolescents and young adults stop using drugs. Soon after beginning the evaluation, he was amazed
to find himself sitting in on an employment interview in which a candidate for a staff position was being
rejected in part because she did not take enough drugs. The actual goal of this program turned out to be to
legitimize what the program regarded as appropriate drug use behavior in that community. Any evaluation
which had judged the program in terms of its stated treatment goal would have been completely out of tune
with reality. The program would have appeared a failure to external powers and the manager and staff
would have found the evaluation totally irrelevant.

This issue also arose with respect to both the outcome and the process evaluations in the case of the
New Life School. In the outcome evaluation, the program's stated goal was to help secondary school
students stop using the drugs with which they were experimenting. In her review of the draft evaluation
report, Sharon, the manager, stressed that the program goal was to [...]
day in a drugfree environment, rather than to try to stop their drug use in nonschool hours. This change in
the program goal had apparently occurred sometime between the proposal to the school board and inception
of the evaluation study. The outcome evaluation had measured a goal that no longer applied to the program.
Much effort could have been saved had the evaluator and the manager fully discussed the program's goals
and objectives during the design of the evaluation.

Michael, the evaluator, partly at the request of the Director of Program Assessment, had focused a
major share of the process evaluation data collection on assessment of the counseling component at New
Life School. He later discovered that Sharon and her staff were not professional counselors and did not
regard counseling as a primary component of the program. They were teachers and had concentrated on
those elements they could best deal with, such as discipline, attendance, and academic performance.

Obviously, Michael could have been more efficient had he carefully reviewed his plans with the funding
agency and Sharon before going ahead with the evaluation. Instead, his priorities were set by the funding
agency representative, who wanted the New Life School evaluated in terms of its published objectives. The
situation would also have been helped had Sharon reissued the statement of objectives, so that the school
administrators responsible for the evaluation could understand the intent of the program.

Are the Tools of the Evaluation Appropriate? 

Another technical concern with important political implications is the relationship between the
evaluation methodology and the objectives of the program. In the evaluation field, certain focal areas have
received the most attention in terms of measurement, instrumentation, and analysis. Three factors combine
to create a dilemma in the measurement of program goals and, therefore, in the ability of the program to be
evaluated:

o Some existing instrumentation does not cover all variables of interest.

o Some existing instrumentation may have debatable validity or reliability.

o Rarely are evaluation resources sufficient to develop and refine new instruments based on unique
program goals.

The evaluator may have to select an instrument that does not correspond exactly to program goals. This
problem arose in every one of the four case studies examined in this chapter, and in two instances it had
serious political ramifications.

In the CYC, a focal objective of the program was to work with the immigrant parents to help them
understand their neighborhood street conditions. The Chinese parents lived in an insular world; they knew
almost no English, could communicate only with other Chinese adults, and spent most of their waking hours
working in factories and restaurants.

The evaluator could not locate an instrument that would assess changes the program tried to produce in
parent knowledge, attitudes, and behaviors regarding child-rearing practices. The manager pressed this
point because it was such a central goal of the CYC program. The failure of the evaluation study to
document achievements with the parents undermined the credibility of the program with the head of the
State funding agency.

In the case of New Life School, the main goal was to maintain a drug-free environment during school
hours. Unfortunately, the evaluator was unaware of any instrument which measured the prevalence of drug
use during a specified portion of the day, so that no attempt was made to evaluate this particular objective.
Overall prevalence of drug use was assessed using a standard instrument. But the inability of the evaluation
study to focus specifically on the central goal of the New Life School had a consequence: the manager felt
acute political repercussions when the evaluation could not "prove" attainment of a major objective.

The manager must understand that only rarely will an outcome evaluation provide existing instrumentation
tailored to the program. Therefore, managers and evaluators must assess in advance which goals and
objectives the available instruments will measure accurately and which they will measure poorly or not at
all. Anticipating this imbalance, they should design the overall evaluation to minimize negative political
implications by communicating evaluation constraints to external decisionmakers and negotiating mutually
acceptable evaluation [...]

[...] impact evaluations. A problem for process
evaluations is that adequate methods are seldom available for recording the substance of the prevention
modality as it is actually implemented. The political implications of instrumentation problems are usually
not as far-reaching for process evaluations because public administrators and the community
have much less experience with these.



ISSUES RELATING TO THE PRESENTATION OF FINDINGS

Throughout the preceding discussion on politics and evaluation, [...] final report has
received emphasis, even though the issues concerned mostly predesign and design phases of the evaluation
study. Usually, the politically sensitive issues of prevention programing do not come into play until the
[...] outside program confines. In [...] such as our four cases, this
usually occurs after the study is completed and the final report is prepared. Larger, longer term studies
may report findings from time to time throughout the course of the evaluation.

If the recommended planning occurs, and if the evaluator and manager have developed a collaborative
relationship, then a strong foundation is laid for dealing with any political issues that arise when findings are
presented to the community and to concerned public administrators.

The Need for a Positive Approach

Evaluation results are almost always ambiguous. (See Weiss 1975 for a fuller discussion of this point
from the perspective of the evaluation of [...].) In fact, evaluation results were
somewhat ambiguous for our four case studies, as evidenced by one aspect from each:

o Project Commune [...] decrease in drug use among participants who stuck with the
program; however, many of those who entered the program left long before they had completed it.
Those who left early showed no change in drug use.

o CYC gave a similar picture. Recently arrived immigrant youth, especially boys, tended to begin
experimenting with drugs and other forms of acting-out behavior. If they were regular CYC
attendees, this experimentation was short-lived, and they continued to be essentially drug free.
If, however, they left the program at or before this point, they sometimes adopted a destructive
lifestyle, based on association with Chinese street gangs who both used and sold drugs, a pattern
common for both boys and girls.

o The MAYA program definitely helped boys reduce acting-out behavior. However, Chicano
teenaged girls in Central City were "over controlled." The impact of such experiences as values
clarification was to encourage the girls to act out more, including more experimentation with
drugs, although their overall level of experimenting and of acting out was less than that of the
boys, both before and after the program. Comparison group girls acted out less and took fewer
drugs than did program girls, whereas comparison group boys acted out considerably more and
were considerably more likely to use drugs than were program boys.

o The New Life School finding was that program youth, based on a number of sources of evidence
but not strictly on outcome data, did experience a drugfree school day. The attendance record
and the quality of the school work for the program students were considerably better than those for
the comparison group students. But the overall level of drug use was
unchanged throughout the program year for both program and comparison group students.

In all four instances, the program could be judged to make an important contribution to drug abuse
prevention. However, these findings could also be presented to emphasize the negative aspect and to make each of
the programs appear a failure. Note that in each case we are considering only one central ambiguity; other
findings showed similar patterns, making a more complex tapestry than we can deal with here.

In each study, the evaluator was committed to a positive approach, trying to help the program build on
its accomplishments and improve its programing. In two of the four cases, CYC and New Life School, the
program was able and willing to take advantage of the negative findings to make
corrections in program strategy. However, Project Commune and MAYA became entangled in problems
with their communities sufficiently serious to produce the demise of both programs. They never had the
opportunity to try to correct deficiencies in their program strategies.

In both instances, the process evaluation tried to place the problems with the community in perspective
to help the program understand and deal with them. Project Commune's managers did not take the written
observations of the evaluation seriously, perhaps because of the lack of trust between the evaluator and the
seven radical managers, growing out of their ideological gulf. MAYA's community problems were so far
advanced by the time the evaluation was underway that a solution to the problem was probably no longer
possible.

If possible, managers should select evaluators with a commitment to a positive approach to evaluation
findings. Evaluators who approach their work primarily as "judges" and who classify programs into only two
categories, successes and failures, are out of tune with the ambiguous character of most evaluation results.
When such evaluators bring with them a generally negative outlook, they can be quite destructive and should
be avoided.

The Presentation of Findings

Even if the evaluator and the manager are prepared to deal with ambiguous findings intelligently and to
make them a point of departure for constructive change, presentation of ambiguous results to the funding
source, to concerned public administrators, and to the community is still difficult. In all four cases, some
community groups were interested in the findings; and in two of these the interest even attracted media
attention. In three of the four cases a State-level funding agency was interested in the effectiveness of the
program. In the fourth case, New Life School, there was an interested local funding source. In all four
cases the evaluation results could affect the current funding agency's decision to continue program support.
Finally, with respect to all four cases, other important public administrators were potentially interested in
obtaining the evaluation findings.

One approach was tried in each case study to help clarify evaluation findings and enhance their
potential for use by external forces. Summaries and presentations were prepared that minimized the
complexity of the findings and presented them constructively. The case summaries were proactive, while
the two kinds of presentations, to funding agencies and to public bodies, were reactive. It is always
desirable for the manager and evaluator to chart a more proactive campaign to disseminate findings.

Responding to Audiences Creatively

The evaluator and the manager must be sensitive to the breadth and character of the issues of concern
to a potential audience and must stress these issues in their presentations, even if those issues were less
critical when the evaluation was originally designed. For example, a prevention evaluation started several
years ago and only now about to present findings may not have paid much attention to cost-benefit issues.
But recent dramatic reductions in Federal support to health and human services have made cost-benefit
arguments crucial. Changing circumstances may require organizing even data collected for other purposes
to make as compelling a case as possible. Managers and evaluators need to have considerable flexibility.

Some other ways to present evaluation findings in their broader context are to:

o discuss the community's prevention service needs and the program's overall contributions to
meeting them

o present the findings to illustrate the human pathos of the program context

o capture the enthusiasm that participants, their families, and interested community members may
spontaneously express toward the program.

Written reports, even concise general summaries, seldom communicate program
accomplishments to members of the general community, while creative use of other media can help reach a
broad audience.

CYC provides an illustration of the innovative use of media for reaching the community. The agency
rented the elementary school auditorium across the street for a Sunday afternoon meeting. The choice of
time was critical, because a large percentage of adult men in the community worked in restaurants
evenings, and many women worked in garment factories on Saturdays. Sundays were the only days during
which both men and women were available for such a meeting.

The immigrant Chinese adults were too tired from working 60, 70, and more hours a week to want to
attend a meeting about CYC; but it was important, given the politics of Chinatown in Big City, to obtain the
interest and support of the community. The manager hit on the idea of showing a popular Chinese movie
free to the persons who attended the Sunday afternoon program. The resulting meeting was a total success.
About 300 adults from the community attended. They saw the first half of the movie. Then during a break
the manager and her staff presented some of the evaluation highlights in a manner interesting to the
community. The evaluator was introduced to the audience, although he did not make a presentation, because
he did not speak Cantonese. After the half-hour of CYC presentations, the remainder of the film was
shown. Afterward, refreshments were served in the school cafeteria. During the refreshment period the
manager and staff mingled with community members and talked with them. As a final attraction,
participants' paintings, calligraphy, and other arts and crafts were exhibited in the foyer.

Subsequent feedback indicated that the meeting had made a strong positive impression. The
resulting support filtered through the active Chinatown grapevine and was helpful in suppressing opposition
from competing programs that regarded CYC as a threat to their sources of funding. CYC illustrates how
the presentation of evaluation findings can involve creative, sensitive approaches.

Dealing with the News Media

In some instances, the program is the focus of media attention whether it wants it or not. New Life
School, MAYA, and Project Commune were all sought out by the newspapers and radio and television news
reporters. The CYC program, however, wished to obtain favorable coverage for itself, and sought out news
coverage in the local Chinatown newspaper and the Chinese radio station in Big City.

Whether contacts with news media are reactive or proactive, keep in mind the following two
considerations and deal with the media appropriately.

First of all, remember that the news media seize upon drug abuse data. Newspaper editors like to build
their headlines around such material. Almost invariably some information regarding the prevalence and
incidence of drug use (and possibly of delinquency or other kinds of destructive behaviors) will appear in the
report of an outcome evaluation. The media tend to blow this information out of proportion, distorting the
real meaning of the findings.

To counter this tendency, the evaluator must develop approaches that play down such statistics or their
uniqueness. He might mention, for example, that such levels of drug use are typical for adolescents in the
area. The important thing is to anticipate a focus on drug use data and to prepare responses designed to
refocus attention by helping news people place the matter in perspective.

A second concern when dealing with the press, radio, and TV is the media's tendency to prefer simple,
either-or stories that they hope to develop in the course of a five- to
ten-minute telephone conversation. This almost always results in serious oversimplification of the findings,
often to the detriment of the program.

The manager and the evaluator should not allow themselves to be trapped in this no-win situation. If
reporters seek information about the evaluation and/or about the program, the manager and the evaluator
should insist on a face-to-face meeting in which the reporters are willing to commit at least 30 minutes of
their time to talking about the program. If they have serious professional intentions, the reporters will
probably agree. If not, it is safe to assume that the potential story would not have been very helpful in
presenting the program to the public.

Assume that the media will be interested. Even if such interest seems unlikely at the time the
evaluation is being developed, unforeseen circumstances can arise that draw the attention of the media and
put the manager and the evaluator on the spot. For example, MAYA did not expect media coverage.
Central City had no Chicano-oriented news media, and Chicano programs seldom attracted the attention of
the Anglo-dominated news media. Near the end of the evaluation, however, a murder occurred in the
Chicano community, an organized crime assassination, and the manager was inadvertently connected with
the event. Suddenly MAYA was briefly in the news. The manager and evaluator were both sensitive to the
program problems that such coverage entailed. Although they had not planned how they would deal with
news reporters, they held a meeting and coordinated their response. Their approach was effective,
and they received in-depth favorable coverage from Central City's two newspapers, from a major television
station, and from an important radio station.

CONCLUDING GUIDELINES



Four conclusions summarize the major points in this chapter and organize them into broad guidelines to
help the evaluator and manager deal with evaluation politics:

o Political issues can subject the evaluation team and the program to considerable pressure,
especially when the evaluation findings become public. To counter these pressures, the evaluator
and the manager must develop a strong collaborative relationship based on trust, respect, and
understanding. Such a relationship arises from an open sharing of relevant values, and a joint
exploration of the larger context of values in which the evaluation program is embedded.

o Evaluations tend to focus on the stated objectives of a program, using tools which are available to
the evaluator. An effective evaluation, which will both strengthen the program and sustain it
through political storms, is based on a sound design developed collaboratively by the evaluator and
the manager; both parties need a clear understanding of the methods selected and their
relationship to the program's goals and objectives.

o Effective evaluation requires appropriate communication of findings to all interested parties,
including the program, the funding source, concerned public administrators, and the community.
The evaluator and the manager must put their joint effort into finding
creative and appropriate means to communicate the findings. Evaluations presented in a positive
light can do much to help a program gain support and evolve into a more effective resource for
the prevention of drug abuse.

o Although the politics which surround evaluations can be a set of thorny problems, they can also be
a source of opportunities. If the manager and evaluator work together to face these issues with
appropriate planning and full awareness of the political context, the program, if actually
effective, should be able to maximize public and funding support.



The author wishes to express his appreciation to his colleague, Robert Emrich, of the General Electric
Company, for his wise observations on the topics discussed in this chapter.
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